
Claude and AI Agents Burn Tokens Faster Than Forecast: Business Learns "Tokenomics"

Companies that deployed AI agents to write code have run into "insane" token consumption, several times higher than forecast. A Silicon Valley software developer and…

AI-processed from Wired; edited by Hamidun News
Source: Wired. Collage: Hamidun News.

Companies that deployed AI agents to write code have run into an unexpected problem: actual token costs turned out to be significantly higher than forecast, and their executives are now urgently learning a new discipline called "tokenomics."

What Happened to Budgets

Wired spoke with several companies about how they manage AI agent expenses in practice. One source, a director at a Silicon Valley software development company, described the situation as "insane": his team began actively using Anthropic's Claude for coding, and token consumption skyrocketed to levels no one had budgeted for. An e-commerce company reported a similar situation: AI agents working in the background generate thousands of tokens on tasks that seem routine, such as code review, test writing, and refactoring. Where a developer would spend an hour on such a task, the agent "thinks" for only several minutes but generates tokens continuously the whole time, so the monthly bill ends up far from what was planned.

Why Agents Burn Tokens So Fast

A token is the unit in which AI providers bill. Every request to a model and every response is charged by token count. In ordinary chat use this is barely noticeable, but AI coding agents work on a fundamentally different principle:

  • they read the entire repository context before each action
  • they call the model repeatedly within a single task
  • they generate long chains of internal reasoning before responding
  • they write, test, and rewrite code until they achieve the desired result
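
The effect of these habits on token counts can be sketched in a few lines. This is a minimal illustration, assuming each agent step is a separate model call that resends the full context plus everything generated so far; the function name `agent_task_tokens` and all the numbers are hypothetical, not taken from Wired's reporting.

```python
def agent_task_tokens(context_tokens: int, steps: int, output_per_step: int) -> dict:
    """Estimate tokens billed for one task.

    Assumption: each step is a separate model call that resends the
    full context plus everything the agent generated in earlier steps.
    """
    input_total = 0
    output_total = 0
    history = 0  # tokens produced by earlier steps, resent as input
    for _ in range(steps):
        input_total += context_tokens + history
        output_total += output_per_step
        history += output_per_step
    return {"input": input_total, "output": output_total}


# One chat exchange versus a five-step agent run (illustrative numbers only).
chat = agent_task_tokens(context_tokens=500, steps=1, output_per_step=300)
agent = agent_task_tokens(context_tokens=5_000, steps=5, output_per_step=500)
print(chat)   # {'input': 500, 'output': 300}
print(agent)  # {'input': 30000, 'output': 2500}
```

Under these assumptions most of the bill comes from input tokens: the same context is paid for again on every step.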

As a result, a task that a developer solves in an hour can "cost" tens of thousands of tokens. At the high rates charged for powerful models like Claude, this adds up to hundreds of dollars per working day per employee.
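
The step from tokens per task to dollars per day is plain multiplication, which a team can run with its own measured numbers. The sketch below is hypothetical throughout: `daily_cost_usd` is an illustrative helper, and the prices per million tokens are placeholders, not any provider's published rates.

```python
def daily_cost_usd(input_tokens: int, output_tokens: int, runs_per_day: int,
                   input_price_per_mtok: float, output_price_per_mtok: float) -> float:
    """Daily spend for one employee's agent runs.

    Prices are in US dollars per million tokens; input and output are
    priced separately, as is common for hosted models.
    """
    per_run = (input_tokens * input_price_per_mtok
               + output_tokens * output_price_per_mtok) / 1_000_000
    return per_run * runs_per_day


# Placeholder figures: 30,000 input and 2,500 output tokens per run,
# 200 runs a day, $3 and $15 per million tokens.
cost = daily_cost_usd(30_000, 2_500, 200, 3.0, 15.0)
print(f"${cost:.2f} per day")  # $25.50 per day
```

The total scales linearly with runs per day and with context size, so the result is only as good as the measured inputs fed into it.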

How Companies Are Restructuring Their Approach

Executives are beginning to introduce the concept of "tokenomics": managing token consumption the same way they used to manage server resources or cloud spending. The first practices have already taken shape:

  • limiting agent context windows: agents see only the relevant part of the codebase
  • caching repeated prompts so tokens aren't recalculated from scratch
  • task routing: cheap models for routine work, powerful ones for complex requests
  • monitoring and alerts for abnormal spending
  • reassessing ROI from AI tools based on actual, not forecasted costs
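
Three of the practices above (context limiting, task routing, and caching) can be sketched together. This is a minimal illustration under stated assumptions: the tier names, the `pick_model`, `limit_context`, and `cached_call` helpers, and the `fake_model` stand-in are all hypothetical, and the cache here is a simple client-side memo rather than a provider's own prompt-caching feature.

```python
import hashlib

# Hypothetical tier names; the routine task labels mirror the article's examples.
CHEAP_MODEL = "small-model"
STRONG_MODEL = "large-model"
ROUTINE_TASKS = {"code_review", "test_writing", "refactoring"}


def pick_model(task_type: str) -> str:
    """Route routine work to the cheap tier, everything else to the strong one."""
    return CHEAP_MODEL if task_type in ROUTINE_TASKS else STRONG_MODEL


def limit_context(files: dict, relevant_paths: set) -> dict:
    """Keep only the files the task touches instead of the whole repository."""
    return {path: text for path, text in files.items() if path in relevant_paths}


_responses = {}  # client-side memo: identical (model, prompt) pairs are sent once


def cached_call(model: str, prompt: str, call_model) -> str:
    """Return a stored response if this exact request was already made."""
    key = hashlib.sha256(f"{model}\n{prompt}".encode()).hexdigest()
    if key not in _responses:
        _responses[key] = call_model(model, prompt)
    return _responses[key]


# Usage with a stand-in for a real API client.
calls = []

def fake_model(model, prompt):
    calls.append(model)
    return f"{model} answered"

repo = {"cart.py": "def total(): ...", "auth.py": "def login(): ...", "README.md": "docs"}
context = limit_context(repo, {"cart.py"})
prompt = "Review this file:\n" + context["cart.py"]
model = pick_model("code_review")
cached_call(model, prompt, fake_model)
cached_call(model, prompt, fake_model)  # served from the memo, no second call
print(model, len(context), len(calls))  # small-model 1 1
```

A real router would likely classify tasks less crudely than a fixed set of labels; the point is only that the choice of model and context is made before any tokens are spent.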

"We used to think of AI as a SaaS subscription with fixed pricing. Now we understand it's more like cloud computing: price depends on how much you use."

Anthropic and other providers offer tools for spending monitoring, but token management remains a headache on the client side.
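
On the client side, the simplest form of spend monitoring is a running counter with thresholds. The sketch below assumes the application can read a token count for each call; the `TokenBudget` class, the 80% alert ratio, and the limits are hypothetical choices, not a feature of any provider's tooling.

```python
class TokenBudget:
    """Track token usage against a daily limit (all thresholds are illustrative)."""

    def __init__(self, daily_limit: int, alert_ratio: float = 0.8):
        self.daily_limit = daily_limit
        self.alert_ratio = alert_ratio
        self.used = 0

    def record(self, tokens: int) -> str:
        """Add usage from one call and return the resulting status."""
        self.used += tokens
        if self.used >= self.daily_limit:
            return "over_budget"
        if self.used >= self.alert_ratio * self.daily_limit:
            return "alert"
        return "ok"


budget = TokenBudget(daily_limit=1_000_000)
statuses = [budget.record(n) for n in (500_000, 350_000, 200_000)]
print(statuses)  # ['ok', 'alert', 'over_budget']
```

A fixed threshold catches overruns but not unusual patterns; detecting abnormal spending in the stricter sense would need a baseline to compare against.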

What This Means

A business bet on AI coding as a way to cut development costs may not pay off if the true cost of tokens isn't accounted for. Companies that master "tokenomics" first, learning to optimize consumption without sacrificing results, will gain a tangible advantage over those managing AI costs blindly.

Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.
