Gemini 3 and the Bluff: Why Neural Networks Are Playing Poker Now

The era of boring MMLU tests is coming to an end. Google is expanding the Game Arena platform, adding poker and the game Werewolf (similar to Mafia). This is not just entertainment.

AI-processed from Google AI Blog; edited by Hamidun News
Source: Google AI Blog. Collage: Hamidun News.

It's time to admit the obvious: traditional benchmarks for neural networks are dying. When a model scores 90% on MMLU, we can no longer tell whether it actually got smarter or simply memorized answers that appeared in its training set. The industry is desperately searching for ways to test "living" intelligence, and Google has decided that the best way to do this is to send AI to the poker table. Expanding the Game Arena platform with new disciplines like poker and Werewolf looks like an attempt to finally pull models out of sterile laboratory conditions and into the chaos of social interaction.

Games have long served as a yardstick for AI progress. First came Deep Blue, which defeated Kasparov in 1997 through sheer computational power. Then came AlphaGo, which showed something like intuition in a game where the number of possible board positions exceeds the number of atoms in the observable universe. But chess and Go are games of perfect information: both players see the entire board. Poker and Werewolf are a different league entirely. Here you have to account for hidden cards, bluff, and, most importantly, build a model of your opponent's psychology. If Gemini 3 Pro can convince a group of people that it's a peaceful villager while actually being a "wolf," that will tell us more about its cognitive abilities than any academic test.

Current results in Game Arena suggest that the Gemini 3 family is comfortable in this environment. The Pro and Flash models have already topped the chess leaderboard, surpassing competitors in their ability to plan many moves ahead. But chess is well-trodden ground for AI. The real challenge begins now, when the models must contend with the irrationality of human behavior in poker. Here it's not enough to calculate the probability of drawing the card you need. You need to understand why your opponent suddenly raised the stakes: do they actually have a royal flush, or are they just hoping you'll get scared?
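
For a sense of how mechanical the "easy" part is, here is a minimal Python sketch of the card-probability arithmetic. It assumes Texas Hold'em, which the article does not specify, and the function names `hit_probability` and `break_even_equity` are illustrative, not part of Game Arena or any Google API.

```python
from fractions import Fraction


def hit_probability(outs: int, unseen: int, draws: int) -> Fraction:
    """Chance that at least one of `outs` helpful cards appears among
    `draws` cards dealt without replacement from `unseen` unknown cards."""
    miss = Fraction(1)
    for i in range(draws):
        # Probability that the (i+1)-th card is also not an out.
        miss *= Fraction(unseen - outs - i, unseen - i)
    return 1 - miss


def break_even_equity(pot: int, call: int) -> Fraction:
    """Minimum win probability at which calling `call` into `pot`
    (the pot already including the opponent's bet) breaks even."""
    return Fraction(call, pot + call)


# Assumed scenario: a flush draw after the flop in Texas Hold'em.
# 9 cards of the suit remain, 47 cards are unseen, 2 cards are still to come.
flush = hit_probability(outs=9, unseen=47, draws=2)
needed = break_even_equity(pot=100, call=50)

print(f"chance to complete the flush: {float(flush):.4f}")
print(f"equity needed to call:        {float(needed):.4f}")
```

None of this says anything about whether the opponent is bluffing. That judgment cannot be read off the deck, and it is the part the new Game Arena disciplines are meant to probe.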

Why does this matter to us, not just to gambling enthusiasts? The reason is that the skills needed to win at Werewolf translate directly to the real world. Contract negotiations, diplomacy, personnel management — all of these are games with incomplete information and elements of bluffing. If Google succeeds in training models that effectively handle such tasks, we won't just get chatbots, but full-fledged negotiator agents. This is a new level of autonomy, where AI understands not only the text of a request but also the hidden motives of whoever wrote it.

Of course, the question of ethics arises. If we train a neural network to be a convincing liar in a game, how do we make it absolutely honest in financial reports or legal advice? The line between "strategic maneuver" and outright disinformation is very thin. Google hasn't yet provided direct answers, focusing on technical achievements instead. However, the very presence of Gemini 3 at the top of gaming leaderboards suggests that the model architecture has become flexible enough to adapt to rules on the fly without losing performance.

In the near future, other market players, namely OpenAI and Anthropic, will be forced to take up this challenge. The era of static tables of numbers is ending. The time of arenas is coming, where intelligence is proven in action. And if your next personal assistant finds it suspiciously easy to talk you into a particular subscription, remember that it may have spent its nights practicing poker on Google's servers.

The bottom line: Google is moving the evaluation of AI from the realm of dry knowledge into the realm of social intelligence. Whether Gemini 3 can outbluff a human is the question of the year.

