OpenAI and Anthropic shift language model pricing metrics: in 2026, task cost matters

Q: What is the source?

Originally published on Habr AI. Hamidun News processes and adapts the material with AI.

Q: When was it published?

Apr 28, 2026. Reading time: 3 min.

Hamidun News Editorial

The large language model market is entering a new stage: a cheaper token no longer means a predictable budget. Almost simultaneously, OpenAI and Anthropic have demonstrated that in 2026, businesses will need to calculate not only the price per million tokens, but also the full cost of completing a task. For companies building products on agentic scenarios, this changes the very logic of procurement, planning, and unit economics.

The first signal came from Anthropic. The company moved its agentic frameworks to usage-based billing, meaning payment for actual token consumption instead of a fixed subscription. In practice, some external wrappers and services that could previously operate on a flat-rate model lose their financial foundation. As long as load was relatively predictable, a subscription was convenient for both provider and client. But in agentic systems, computational costs grow rapidly: the model does not simply respond to one request, but plans steps, makes multiple calls, accesses tools, double-checks results, and can launch a long chain of actions.

In parallel, OpenAI changed its approach for corporate clients. In its Enterprise, Business, and EDU plans, the company introduced more flexible pricing, where cost scales with usage volume rather than remaining rigidly tied to seat licenses. For procurement teams, this is an important shift. Until recently, a subscription could be treated as an almost fixed expense item; now the model is closer to cloud services, where the bill depends much more on actual usage intensity.

The more actively employees use generation, search, document analysis, and agentic functions, the more noticeably the bill changes.

This does not cancel out another trend the market has observed over the last two years. From 2023 to 2025, APIs did become cheaper, and the cost per million tokens for GPT-4-class models declined. This is why many teams got used to reasoning by a simple rule: if the token price falls, LLM adoption automatically becomes more profitable over time.

In 2026, this rule no longer works without caveats. The key metric now is not the price per token itself, but the cost of solving a specific task. If one useful result requires the system to make multiple passes, use long context, call tools, perform additional checks, and regenerate output several times, the total bill can grow even though the API is nominally cheaper.

This is especially noticeable in agentic products, where one scenario that looks to the user like a single action can internally break down into dozens of model operations.
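The arithmetic behind this can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than data from the article or from any provider: the `Call` type, the function names, the token counts, and the per-million prices are all hypothetical. The sketch only shows how a task made of many calls at a lower token price can cost more than a single call at a higher one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One model call inside a task (hypothetical structure)."""
    input_tokens: int
    output_tokens: int


def call_cost(call: Call, input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of a single call, given prices per million tokens."""
    total = call.input_tokens * input_price_per_m + call.output_tokens * output_price_per_m
    return total / 1_000_000


def task_cost(calls: list[Call], input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of a whole task: the sum over every call it triggers."""
    return sum(call_cost(c, input_price_per_m, output_price_per_m) for c in calls)


# Hypothetical scenario A: one request, one response, at a higher token price.
single_shot = [Call(input_tokens=2_000, output_tokens=500)]
old_cost = task_cost(single_shot, input_price_per_m=10.0, output_price_per_m=30.0)

# Hypothetical scenario B: an agentic workflow of 20 calls with long context,
# at a token price four times lower on input and three times lower on output.
agentic = [Call(input_tokens=8_000, output_tokens=400)] * 20
new_cost = task_cost(agentic, input_price_per_m=2.5, output_price_per_m=10.0)

print(f"single call, pricier tokens: ${old_cost:.3f}")
print(f"20-call agent, cheaper tokens: ${new_cost:.3f}")
```

Under these made-up numbers the agentic task costs more than ten times as much per result, even though every token in it is cheaper.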

A practical conclusion follows for teams. LLM budgeting now needs to be built around the cost of a completed action: how much one report, one document analysis, one assistant session, or one successfully executed agentic workflow costs. This gives rise to new product requirements: eliminating unnecessary steps, controlling the depth of agentic reasoning, reducing context, caching, routing to cheaper models where permissible, and rigorously measuring which calls actually create value.
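One possible way to measure the cost of a completed action is to divide total spend, including failed or abandoned attempts, by the number of successful outcomes. The sketch below is a minimal illustration of that idea; the function name, the input format, and the sample figures are assumptions made for this example and do not come from the article.

```python
from typing import Optional


def cost_per_completed_action(
    attempt_costs: list[float], succeeded: list[bool]
) -> Optional[float]:
    """Total spend across all attempts divided by the number of successes.

    Failed attempts still cost money, so they stay in the numerator.
    Returns None when nothing succeeded, since the metric is undefined.
    """
    if len(attempt_costs) != len(succeeded):
        raise ValueError("attempt_costs and succeeded must have the same length")
    successes = sum(succeeded)
    if successes == 0:
        return None
    return sum(attempt_costs) / successes


# Hypothetical log of four agentic workflow runs (costs in dollars).
costs = [0.40, 0.55, 0.35, 0.70]
outcomes = [True, False, True, True]

average_attempt = sum(costs) / len(costs)
per_result = cost_per_completed_action(costs, outcomes)

print(f"average cost per attempt: ${average_attempt:.2f}")
print(f"cost per completed action: ${per_result:.2f}")
```

The gap between the two printed figures is the price of failed runs, which a per-token or per-call view does not show.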

For CTOs, CPOs, and finance teams, this means moving from talk about "cheap AI" to proper operational economics, where what matters is not an attractive price in a table, but the cost of a specific business result.

The main point of this shift is that the LLM market has not stopped getting cheaper, but has stopped being naively simple. The compute crunch of 2026 is a question not only of available capacity, but also of managing expenses. The winners will not be the companies that look at the lowest price per token, but those that can calculate the cost of the end result and design systems so that each additional token brings measurable value.
