DeepMind: More agents, worse results?
DeepMind has published research that challenges the popular idea that increasing the number of AI agents in a system improves the final result.
AI-processed from Jiqizhixin (机器之心); edited by Hamidun News
The artificial intelligence industry has long lived by a simple logic: more means better. More parameters, more data, more computing power. Now DeepMind is challenging the next turn of this idea—the belief that increasing the number of AI agents within a system automatically leads to an increase in its capabilities. The company's new research suggests that multi-agent architectures have a structural ceiling, and approaching it could be costly for those who have bet on agent scaling as the main path forward.
The idea of multi-agent systems is not new in itself. Over the past two years, leading laboratories—OpenAI, Anthropic, Google—have been actively promoting a concept in which several specialized AI agents work together, distributing tasks and double-checking each other's results. It was assumed that such an architecture mimics collective intelligence: one agent writes code, another tests it, a third looks for errors, and an overall coordinator oversees the process. The logic seemed flawless—until DeepMind started measuring what actually happens in practice.
The research found a counterintuitive effect: starting from a certain threshold, adding new agents does not improve the result but worsens it. The reason lies in coordination costs. Each new agent in the system is not merely additional computing power, but also a new source of potential contradictions. Agents must coordinate intermediate results, pass context, and resolve conflicts of interpretation. With a small number of participants, these overhead costs are insignificant. As their number grows, they begin to consume the very gains the system was created for. At some point, the system stops being an orchestra and becomes a crowd.
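The shape of this effect can be illustrated with a toy model. The sketch below is not taken from the DeepMind research: it simply assumes that every agent contributes a fixed amount of useful work and that every pair of agents incurs a fixed coordination cost. The function names and the values of `gain_per_agent` and `cost_per_channel` are hypothetical, chosen only to make the curve visible.

```python
# Toy model (an illustrative assumption, not the method used in the study):
# useful work grows linearly with the number of agents, while coordination
# overhead grows with the number of agent pairs, n * (n - 1) / 2.


def pairwise_channels(n_agents: int) -> int:
    """Number of distinct agent pairs that may need to coordinate."""
    return n_agents * (n_agents - 1) // 2


def net_benefit(n_agents: int, gain_per_agent: float = 1.0,
                cost_per_channel: float = 0.08) -> float:
    """Useful work minus coordination overhead under the toy model."""
    overhead = cost_per_channel * pairwise_channels(n_agents)
    return n_agents * gain_per_agent - overhead


def best_team_size(max_agents: int = 30) -> int:
    """Team size with the highest net benefit in the range 1..max_agents."""
    return max(range(1, max_agents + 1), key=net_benefit)


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 13, 20, 30):
        print(f"{n:>2} agents: {pairwise_channels(n):>3} channels, "
              f"net benefit {net_benefit(n):.2f}")
    print("best team size:", best_team_size())
```

With these made-up parameters the net benefit rises, peaks at a moderate team size, and then falls below zero as the number of pairs outgrows the number of agents. Real systems will not follow this exact curve, but the mismatch between linear gains and faster-growing overhead is the intuition behind the threshold described above.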
Technically, the problem is compounded by the fact that modern language models lack a reliable mechanism for resolving contradictions between agents. When two agents reach different conclusions, which happens more often as tasks grow more complex, the system needs an arbiter, a voting protocol, or a rollback to one of the variants. Each of these approaches introduces its own distortions. An arbiter can itself make mistakes. Majority voting suppresses non-standard but correct solutions. A rollback means that part of the work was done in vain. None of these are bugs in specific implementations; they are fundamental properties of distributed systems that engineers have been fighting for decades, even in classical software.
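The weakness of majority voting is easy to see in a minimal sketch. The agent answers below are invented for illustration, and `majority_vote` is a hypothetical helper rather than part of any agent framework.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> str:
    """Return the most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


# Hypothetical scenario: five agents answer the same question, and only
# the fourth agent has found the correct but unusual solution.
agent_answers = ["standard", "standard", "standard", "unusual", "standard"]
correct_answer = "unusual"

chosen = majority_vote(agent_answers)

if __name__ == "__main__":
    print("chosen by vote:", chosen)
    print("vote picked the correct answer:", chosen == correct_answer)
```

The vote returns the answer most agents agree on, so a lone correct agent is outvoted regardless of the quality of its reasoning. Adding more agents that share the same blind spot only makes the wrong answer look more certain.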
For the industry, this finding has serious practical consequences. Startups and large companies have invested significant resources in building so-called agent frameworks: AutoGen (from Microsoft), CrewAI, LangGraph, and many other tools designed specifically for orchestrating large numbers of agents. The thesis that agent scaling compensates for the limitations of individual models has become almost a dogma in technology pitch decks. If DeepMind is right, some of these architectural decisions will need to be reconsidered now, not years from now.
At the same time, it is important not to overstate the pessimism of the research. The "agent ceiling" is not a death sentence for the multi-agent approach as such, but rather an indication that scaling should be smart, not mechanical. Systems with a small number of well-specialized agents, clearly divided zones of responsibility, and minimal task overlap continue to demonstrate real productivity gains. The problem arises when developers add agents on the principle of "more is better" without considering how coordination between them is organized.
DeepMind's finding fits into a broader discussion about scaling limitations in AI, which has noticeably intensified in recent months. After several years in which increased computing power almost automatically provided quality improvements, the industry is increasingly facing diminishing returns, whether in pretraining large models or, now, in agent architectures. This does not mean progress has stopped. It means that simple recipes no longer work and must give way to more carefully designed architectures. For laboratories competing for leadership in the age of agentic AI, DeepMind's results are not a cause for panic, but a reason to seriously reconsider how exactly they plan to build next-generation systems.