
Collapse of AI agents: mathematics proved they'll never be reliable

While investors pour billions into the concept of "agents" that will supposedly replace employees, mathematicians from Oxford and other institutions have delivered their verdict. It turns out...

AI-processed from Futurism; edited by Hamidun News
Source: Futurism. Collage: Hamidun News.

Remember how last year everyone suddenly stopped talking about chatbots and started dreaming about "agents"? We were promised that AI would soon book tickets on its own, write code for entire departments, and manage supply chains while we sip smoothies. It turns out mathematics has a different opinion on the matter, and a rather unpleasant one for those who have already rewritten their business plans around "full autonomy."

A group of researchers published work that strikes at the most painful point in the industry: the mathematical impossibility of reliable autonomous AI agents. The problem isn't that the models are "stupid" or lack training data from Reddit. The problem lies in the very structure of sequential tasks. Ask an AI to do something in a single step, and the probability of success is high. But as soon as you string together a chain of ten actions, the merciless arithmetic of probability theory takes over.

Imagine each step of an agent executes with 95% accuracy. Sounds pretty good, right? In the human world, that's straight-A student level. But in a chain of ten steps, where every step has to succeed and the errors are independent, the overall probability of success is 0.95 multiplied by itself ten times, which comes to just under 60%. And if there are a hundred steps? The chance that the agent will reach the end without turning your project into digital garbage falls to roughly 0.6%. This is called "catastrophic error accumulation," and apparently, it can't be cured by simply increasing the context window or adding another batch of H100 graphics cards.
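The arithmetic is easy to check yourself. Here is a minimal sketch, assuming every step succeeds independently with the same probability; the function name `chain_success` is ours for illustration, not something from the researchers' paper.

```python
def chain_success(p_step: float, n_steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumption (ours): steps fail independently and share the same
    per-step success probability p_step.
    """
    return p_step ** n_steps


if __name__ == "__main__":
    # The article's example: 95% per-step accuracy over 1, 10 and 100 steps.
    for n in (1, 10, 100):
        print(f"{n:>3} steps: {chain_success(0.95, n):.4f}")
```

Real agents rarely fail independently from step to step, so treat the numbers as a back-of-the-envelope estimate rather than a prediction for any particular system.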

The industry is now in an extremely strange position. On one hand, venture capitalists are pouring billions into startups like Cognition, which promises "the world's first AI engineer." On the other hand, dry mathematics says: "Guys, this won't work the way you're drawing it in your presentations." We're trying to build a skyscraper on a swamp, hoping that if we make the facade prettier, the foundation will strengthen itself.

The most ironic thing here is that companies continue to sell "autonomy" as the main feature. But in reality, we get systems that need constant oversight. This isn't liberation from routine, but a new form of supervision where humans become eternal correctors of a semi-insane algorithm. If an agent makes mistakes 5% of the time, but does so silently and with absolute confidence in its correctness, it becomes more dangerous than the laziest and most incompetent employee. A human's mistake can be predicted; a statistical model's mistake cannot.

We used to marvel at how neural networks write poems or explain quantum physics. It was fun, but practically useless for real production. Then came the era of agents. The idea was simple: give the model tools—a browser, a terminal, API access—and let it act. This turned AI from a "smart parrot" into a "digital intern." But as it turned out, this intern suffers from a severe form of progressive inattentiveness, baked into it at the formula level.

What does this mean for us in the near term? Most likely, the era of "press the button—get the result" is postponed indefinitely. We'll have to rethink our approach to AI architecture: move from full autonomy to tightly controlled modules, where each step is verified not by another AI, but by formal methods or a live human. Mathematics can't be fooled by flashy demos on X (formerly Twitter). The path to true intelligence capable of reliable action lies through understanding causal relationships, not merely guessing the next token.
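Why would step-by-step verification help? A rough sketch, and very much our own simplification rather than anything from the paper: assume a checker that catches every failed step, and assume a failed step can be retried with an independent chance of success. The names `verified_chain_success` and `max_attempts` are hypothetical.

```python
def verified_chain_success(p_step: float, n_steps: int, max_attempts: int) -> float:
    """Chain success probability when each step is checked and retried.

    Assumptions (ours, and optimistic): the checker never accepts a bad
    result, a failed step can be undone and retried, and every attempt
    succeeds independently with probability p_step.
    """
    # A step fails only if all of its attempts fail.
    per_step = 1.0 - (1.0 - p_step) ** max_attempts
    return per_step ** n_steps


if __name__ == "__main__":
    # 100-step chain at 95% per-step accuracy, with 1, 2 or 3 attempts per step.
    for attempts in (1, 2, 3):
        p = verified_chain_success(0.95, 100, attempts)
        print(f"{attempts} attempt(s) per step: {p:.4f}")
```

With a single attempt this reduces to plain error accumulation. The gain from retries depends entirely on how good the checker is, which is the argument for formal methods or a live human rather than another model doing the checking.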

The bottom line: AI agents in their current form are a statistical trap. Until we solve the error accumulation problem, "autonomy" will remain merely an expensive and unreliable attraction for investors. Are we waiting for the hype to finally collide with reality?

Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.
