Google Gemini 3 Deep Think: new record in general intelligence tests

Q: What is the source?

Originally published on MarkTechPost. Hamidun News processes and adapts the material with AI.

Q: When was it published?

2026-02-13. Reading time: 2 min.

Google announced an update to Gemini 3 Deep Think focused on science and engineering. The key achievement was a score of 84.6% on the ARC-AGI-2 benchmark.

Hamidun News Editorial





Google is crossing a new frontier in artificial intelligence development. The company announced an update to Gemini 3 Deep Think that scores 84.6% on the ARC-AGI-2 benchmark, a test that some researchers regard as the last serious barrier before general intelligence. This is not merely another incremental model improvement but a fundamental shift in how artificial neural networks solve complex problems. Instead of simple text generation, the system now uses a deep reasoning mode with internal verification, which lets the model check its own logic in real time.

To understand the significance of this step, it's worth recalling what has happened in the industry over the past few years. Large language models like GPT and Claude excel at text generation but often stumble on tasks that require multi-step logical inference and verification of results. Researchers designed ARC-AGI-2 specifically to resist simple model scaling: it is a set of logic and abstract reasoning tasks that require actual reasoning, not just predicting the next word. A score of 84.6% means that Gemini 3 Deep Think solves roughly 85 of every 100 such tasks correctly, better than four out of five, a level that was previously out of reach even for the most powerful systems.

Technically, this is achieved through a new internal reasoning mechanism. The model no longer rushes to provide an answer; it goes through several stages of deliberation, checking each logical step before formulating the final answer. This is similar to how a mathematician does not simply state a result but works through the problem step by step, verifying each calculation. Google has built self-verification into the model, which sharply reduces the probability of logical errors. The approach applies not only to abstract puzzles but also to real scientific and engineering tasks that require deep analysis and hypothesis verification.
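
Google has not published the internals of this mechanism, so the following is only a toy sketch of the general propose-then-verify idea, not a description of how Gemini 3 Deep Think works. It assumes a tiny, hypothetical set of candidate grid transformations (`CANDIDATE_RULES`) in the spirit of ARC-style puzzles. Each candidate is checked against every worked example before it is allowed to produce an answer, and the solver abstains if nothing passes.

```python
from typing import Callable, Optional

Grid = list[list[int]]
Rule = Callable[[Grid], Grid]

# Hypothetical candidate rules, chosen only for illustration.
# A real system would search a far larger space of hypotheses.
CANDIDATE_RULES: dict[str, Rule] = {
    "identity": lambda g: [row[:] for row in g],
    "flip_horizontal": lambda g: [row[::-1] for row in g],
    "flip_vertical": lambda g: [row[:] for row in g[::-1]],
    "transpose": lambda g: [list(col) for col in zip(*g)],
}


def verify(rule: Rule, examples: list[tuple[Grid, Grid]]) -> bool:
    """Return True only if the rule reproduces every example output."""
    return all(rule(inp) == out for inp, out in examples)


def solve(
    examples: list[tuple[Grid, Grid]], test_input: Grid
) -> Optional[tuple[str, Grid]]:
    """Propose each candidate rule and answer only with one that passes verification."""
    for name, rule in CANDIDATE_RULES.items():
        if verify(rule, examples):
            return name, rule(test_input)
    return None  # abstain rather than guess when no candidate survives the check


if __name__ == "__main__":
    examples = [([[1, 2], [3, 4]], [[2, 1], [4, 3]])]
    print(solve(examples, [[5, 6, 7]]))
```

Run as a script, this prints `('flip_horizontal', [[7, 6, 5]])`: the identity rule is proposed first, fails verification on the example pair, and is discarded before the horizontal flip is accepted.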

This is precisely why Google positions this update as a tool for science and engineering, rather than entertainment. The model is now capable of assisting researchers in designing complex systems, verifying scientific hypotheses, and solving engineering problems that require multi-level analysis. This could accelerate the development of new materials, drugs, microchip architectures, and other complex systems, where each error costs significant time and money.

What does this mean for the future of AGI, or artificial general intelligence? A score of 84.6% on ARC-AGI-2 is not a finish line, but it is a clear signal that the field is moving not toward ever more powerful text generators but toward systems capable of genuine reasoning. This paradigm differs from the one that has dominated the past several years. The shift away from scaling models to trillions of parameters and toward verification mechanisms and step-by-step reasoning may be exactly what further progress requires.

However, we should be careful with our formulations. A high score on a single benchmark does not mean that AGI is already here. ARC-AGI-2 tests one specific type of intelligence: abstract logical reasoning. Real general intelligence will require success on many other fronts, including understanding context, dealing with uncertainty, and adapting to new situations. Nevertheless, Google's achievement shows that the path toward it is becoming clearer. Models are learning not just to generate, but to think, verify, and justify.
