DeepMind proposed ten cognitive scales for measuring progress toward AGI
Google DeepMind published "Measuring Progress Toward AGI" — a follow-up to its 2023 AGI levels classification. Instead of a single rating, it offers ten…
AI-processed from Habr AI; edited by Hamidun News
Google DeepMind has published a paper titled "Measuring Progress Toward AGI," an attempt to give the industry a tool for actually measuring progress toward AGI rather than yet another classification system that cannot be verified.
Where the problem came from
In 2023, DeepMind published "Levels of AGI," a system of five levels of intelligence (from emerging to superhuman) and six levels of autonomy (from simple tool to fully autonomous agent). The analogy with autonomous driving levels turned out to be apt: it was structured, intuitive, and convenient for explaining to investors and journalists. The industry gained a common vocabulary for talking about AGI.
But the classification had a fundamental flaw: there was no tool to verify where any given system actually stood. Each company could call its model "level 2" or "level 3," and no one had a way to dispute it. "AGI" became a marketing label, convenient for press releases and fundraising but of little use to science.
The new paper sets out to solve exactly this problem.
Ten scales instead of one score
The paper, released in March 2026, proposes a fundamentally different approach. Instead of a single overall rating, it defines ten separate scales, each measuring a specific aspect of cognitive ability. The scales are independent: a system can score high on reasoning but low on adaptation to new tasks, and this mismatch stays clearly visible instead of being hidden behind an averaged value. The result is a multidimensional portrait of a system, not a single number.
The key difference from conventional benchmarking is that the scales are built not on datasets and problem sets but on the tools of cognitive psychology, a field that has studied human intelligence for decades and developed methods designed to resist practice effects.
Among the measured aspects:
- Working memory and context retention
- Planning and multi-step reasoning
- Transfer of knowledge to new domains
- Learning from a small number of examples (few-shot)
- Meta-cognition — understanding the boundaries of one's own knowledge
- Causal reasoning
- Adaptation to data outside the training distribution
The authors position the framework as a starting point for discussion, not a final standard. The list of scales is open for expansion.
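The argument for independent scales over one averaged score can be illustrated with a minimal sketch. Everything here is hypothetical: the scale keys paraphrase the aspects listed above, the 0-100 scoring range and the numbers are invented for illustration, and nothing below reflects the paper's actual scoring method.

```python
# Hypothetical 0-100 scores; the keys paraphrase the aspects listed in the
# article, the values are made up purely to illustrate the argument.
system_a = {
    "working_memory": 90,
    "planning": 90,
    "knowledge_transfer": 20,
    "few_shot_learning": 30,
    "metacognition": 60,
    "causal_reasoning": 90,
    "out_of_distribution_adaptation": 40,
}
# A second hypothetical system that is uniformly mediocre on every scale.
system_b = {name: 60 for name in system_a}


def average(profile):
    """Collapse a profile into one number (the approach being criticized)."""
    return sum(profile.values()) / len(profile)


def weakest_scales(profile, threshold=50):
    """Return the names of scales scoring below the threshold, sorted by name."""
    return sorted(name for name, score in profile.items() if score < threshold)


def spread(profile):
    """Gap between the best and worst scale; 0 means a perfectly flat profile."""
    return max(profile.values()) - min(profile.values())


for label, profile in (("A", system_a), ("B", system_b)):
    print(label, average(profile), spread(profile), weakest_scales(profile))
```

Both hypothetical systems collapse to the same average, yet only the per-scale view shows that system A is weak on transfer, few-shot learning, and out-of-distribution adaptation.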
Why this matters more than benchmarks
Until now, progress in AI has been measured indirectly, through benchmarks such as MMLU, HumanEval, ARC-Challenge, and GSM8K. The problem is that models are often tuned, deliberately or not, to specific benchmarks. A high MMLU score long ago ceased to be a reliable indicator of actual reasoning. The industry widely acknowledges this, yet the standard benchmarks remain in use.
The cognitive-psychology approach is significantly harder to fool. If a model cannot generalize to fundamentally new tasks, no amount of additional training on the test set will hide it. Methodologies developed to measure human intelligence are designed to resist attempts to game them.
For investors, corporate AI buyers, and regulators, this could mean the end of an era in which any lab could announce an "AGI breakthrough" that no one was able to verify independently. Common measurable scales make systems from different companies comparable and, with that, make their developers accountable.
What this means
DeepMind is shifting the conversation about AGI from "we have level N" to "here is specifically how this can be measured." The paper does not answer the question of AGI timelines, and it does not guarantee consensus: different labs may interpret the scales differently. But it is a serious first step toward common evaluation standards built on science rather than marketing.