OpenAI, Google, and Anthropic Accelerate AI Model Race, but the Market Is Already Tired of the Noise
AI-processed from Habr AI; edited by Hamidun News
February 2026 turned into a conveyor belt of releases: OpenAI, Google, Anthropic, xAI, and Chinese labs shipped new models days apart, and the market now counts hundreds of LLMs from dozens of organizations. Against this backdrop, any single announcement matters less than the question of what in this race actually changes how people and companies work.
Why Everything Accelerated
Three years ago, months passed between major releases; now the gap is sometimes just two days, as between GPT-5.3 and GPT-5.4.
The market already has over 500 language models from 30+ organizations, and this clearly shows the scale of acceleration. There are several reasons. First, the race stopped being a duel between OpenAI and Google: Anthropic, xAI, Meta, Mistral, and Chinese players like DeepSeek, Qwen, Zhipu, and ByteDance have fully joined in.
Second, compute has become cheaper: efficient architectures and new hardware have reduced training and inference costs. Third, the leaders have enormous funding that allows them to run multiple teams in parallel and develop different model lines simultaneously. Open source is a separate accelerator.
When Meta, Mistral, and DeepSeek release models with open weights, proprietary labs have to prove more often what users are paying for with their subscriptions. Chinese companies stand out here: because of chip restrictions, they are forced to find more cost-effective training methods, and these solutions quickly end up in the open ecosystem. As a result, the market lives under constant mutual pressure: closed models ship faster, open models catch up faster, and users get ever cheaper and more powerful tools.
Benchmarks Don't Equal Usefulness
On paper, everything looks impressive. Gemini 3.1 Pro sets records on GPQA and ARC-AGI-2, Claude Sonnet 4.6 outperforms even the more expensive Opus 4.6 on office tasks, and GPT-5.4 leads in coding and agentic scenarios. But the gap between the best models is no longer as dramatic as it was in the GPT-4 era. On most practical tasks, the difference is not a chasm but a few percentage points that the end user rarely notices. For a team building a product, the choice increasingly comes down not to the absolute quality of the answer, but to token price, latency, stability, and API convenience.
There's also a more unpleasant problem: benchmarks only measure the conditions baked into them. Solving a physics exam question or passing a code generation test is a useful signal, but it does not mean the model will handle murky, incomplete, context-dependent business tasks well. That's why a 2% record does not translate into double the practical value.
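The selection logic described above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended procedure: the candidate names, pass rates, prices, and latencies are all made-up placeholders, and `task_pass_rate` is assumed to come from a team's own test tasks rather than from a public benchmark.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str                   # hypothetical label, not a real model
    task_pass_rate: float       # share of your own test tasks solved, 0..1
    usd_per_1k_requests: float  # assumed price, illustrative only
    p95_latency_s: float        # assumed 95th-percentile latency in seconds


def pick_model(
    candidates: Sequence[Candidate],
    min_pass_rate: float,
    max_latency_s: float,
) -> Optional[Candidate]:
    """Return the cheapest candidate that meets both thresholds, or None."""
    eligible = [
        c for c in candidates
        if c.task_pass_rate >= min_pass_rate and c.p95_latency_s <= max_latency_s
    ]
    return min(eligible, key=lambda c: c.usd_per_1k_requests, default=None)


# All numbers below are invented for the example.
CANDIDATES = [
    Candidate("flagship", task_pass_rate=0.93, usd_per_1k_requests=40.0, p95_latency_s=6.0),
    Candidate("mid-tier", task_pass_rate=0.91, usd_per_1k_requests=12.0, p95_latency_s=2.5),
    Candidate("lite", task_pass_rate=0.78, usd_per_1k_requests=1.5, p95_latency_s=0.8),
]

choice = pick_model(CANDIDATES, min_pass_rate=0.90, max_latency_s=3.0)
print(choice.name if choice else "no candidate qualifies")
```

With these placeholder numbers, the top scorer is rejected on latency and the cheapest option on quality, so the middle candidate is chosen even though it is not first on any single metric.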
It's no coincidence that the main advice in this race is this:
Don't chase the latest model; chase the result.
Then comes production reality. There are many pilots, but few mature implementations: only 11% of companies have brought AI agents to full production, although 38% are already experimenting with pilots. Executives acknowledge productivity gains, but far fewer can show strong ROI or a change in their business model. General-purpose agents still make mistakes, get stuck in loops, and work poorly without oversight. Hence the growing AI fatigue: the market is tired of promises that look better in demos than in real operations.
Where the Effect Is Already Visible
At the same time, usefulness exists, and it's quite measurable. In development, specialized models accelerate code generation and refactoring, and assistants inside IDEs have long since become working tools, not toys. In document analysis, large context windows allow processing contracts, reports, and research materials in a single pass, leaving the final check to humans. Science is a separate front: reasoning models help find new structures in mathematics and accelerate drug discovery and materials analysis. The market is also moving rapidly toward cost-efficiency: today the price per useful result matters no less than a model's record score.
- Code generation and review
- Processing long documents and reports
- Scientific computation and the search for new hypotheses
- Cheap lite models for high-volume scenarios
The most underestimated shift of 2026 is the falling cost of powerful models. When Sonnet-level solutions approach Opus, and fast versions like Flash-Lite cut price and latency by an order of magnitude, AI stops being a privilege of large teams. This opens scenarios that simply didn't add up economically before: mass processing of customer inquiries, a cheap first pass for lawyers and analysts, automation of internal documentation, custom assistants on company data. And this is where competition on usefulness, not marketing, really begins.
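To make "price per useful result" concrete, here is a rough back-of-the-envelope sketch. It assumes that a failed request is simply retried until it succeeds, so the expected cost of one useful result is the price of a request divided by the success rate. The prices and success rates are hypothetical, and the cost of a human checking the output is not counted.

```python
def cost_per_useful_result(usd_per_request: float, success_rate: float) -> float:
    """Expected spend per successful result when failures are retried.

    The expected number of attempts until the first success is
    1 / success_rate, so the expected cost is price / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return usd_per_request / success_rate


# Hypothetical numbers: a large model vs. a lite model that is
# ten times cheaper per request but fails more often.
large = cost_per_useful_result(usd_per_request=0.020, success_rate=0.95)
lite = cost_per_useful_result(usd_per_request=0.002, success_rate=0.80)

print(f"large: ${large:.4f} per useful result")
print(f"lite:  ${lite:.4f} per useful result")
print(f"large / lite: {large / lite:.1f}x")
```

Under these invented numbers the lite model stays roughly eight times cheaper per useful result despite its lower success rate. That gap is what makes high-volume scenarios such as first-pass document review economically viable, provided the remaining errors are caught by a later check.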
What It Means
The model race in 2026 is both real progress and a layer of loud marketing on top of it. What to watch is not who tops the leaderboard today, but which models are cheaper, more reliable, and better at solving a specific task in production.