OpenAI's ChatGPT 5.4 beat Claude Opus 4.6 and Gemini 3.1 Pro in a Habr comparison
Habr compared Gemini 3.1 Pro, ChatGPT 5.4, and Claude Opus 4.6 across four everyday scenarios: text generation, PDF summarization, math, and Python…
AI-processed from Habr AI; edited by Hamidun News
Habr has published an extensive hands-on review of three flagship models: Gemini 3.1 Pro, ChatGPT 5.4, and Claude Opus 4.6. Instead of abstract benchmarks, the author tested everyday tasks, from writing a story and summarizing a PDF to math and a Python application, and by total score unexpectedly ranked ChatGPT first.
How they compared
The test covered four scenarios that users run into with AI every day. First, the models were asked to write a humorous fantasy story in three chapters. Next, they were given a PDF of a practical assignment and asked to produce a concise but usable summary without losing key information. Then came a block of four math problems, and the final test was to build a desktop application in Python: an engineering calculator with a GUI and an embedded Snake game.
The scoring was kept practical. Text and code tasks were graded on a three-point scale, while the math stage was worth up to four points, one for each correctly solved problem. For the first time, the author also included the cost of each request in rubles in the table, so the comparison covers not only answer quality but also what the result costs. The maximum under this scheme is 13 points, and the combination of points and spending was the main criterion for the final choice.
Who won the stages
In the first stage, ChatGPT slipped slightly because of a chapter numbering error and received 2.5 points, while Gemini and Claude each took the maximum of 3. In the second round the picture changed: ChatGPT summarized the PDF better than the others and preserved important details, while Gemini and Claude, in the author's opinion, cut the text too aggressively and lost some necessary information. The math block was a tie for all three, but programming again separated the models, this time on whether the generated application actually worked.
- Text generation: Gemini 3.1 Pro scored 3 points for 20 rubles, Claude Opus 4.6 scored 3 points for 68 rubles, and ChatGPT 5.4 scored 2.5 points for 25 rubles.
- PDF summarization: ChatGPT 5.4 received 3 points for 24 rubles; Gemini 3.1 Pro and Claude Opus 4.6 took 2 points each, for 16 and 38 rubles respectively.
- Math: all three models solved the problems perfectly, but ChatGPT 5.4 was the cheapest at 15 rubles, versus 22 for Gemini and 29 for Claude.
- Programming: ChatGPT 5.4 received 3 points for a working calculator and Snake; Gemini 3.1 Pro got 2.5 points because key presses were not captured properly in the game; Claude Opus 4.6 got 2 points because of an error when dividing by decimal numbers.
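The stage results above can be tallied with a short Python sketch. It is only an illustration of the arithmetic, not the reviewer's own method: the names `STAGE_MAX`, `SCORES`, `stage_leaders`, and `listed_totals` are hypothetical, and the math score of 4 for every model is an assumption based on the statement that all three solved the problems perfectly.

```python
# Maximum points per stage, as described in the scoring scheme.
STAGE_MAX = {"text": 3, "pdf": 3, "math": 4, "code": 3}

# Per-stage scores as listed in the bullets above.
# Assumption: "solved the problems perfectly" means 4 of 4 math points.
SCORES = {
    "ChatGPT 5.4":     {"text": 2.5, "pdf": 3, "math": 4, "code": 3},
    "Gemini 3.1 Pro":  {"text": 3,   "pdf": 2, "math": 4, "code": 2.5},
    "Claude Opus 4.6": {"text": 3,   "pdf": 2, "math": 4, "code": 2},
}


def stage_leaders(scores):
    """Return, for each stage, the models that share the top score."""
    leaders = {}
    for stage in STAGE_MAX:
        best = max(per_model[stage] for per_model in scores.values())
        leaders[stage] = [m for m, s in scores.items() if s[stage] == best]
    return leaders


def listed_totals(scores):
    """Sum the per-stage scores exactly as listed for each model."""
    return {model: sum(per_stage.values()) for model, per_stage in scores.items()}


if __name__ == "__main__":
    print("Maximum possible:", sum(STAGE_MAX.values()))
    for stage, models in stage_leaders(SCORES).items():
        print(f"{stage}: {', '.join(models)}")
    for model, total in listed_totals(SCORES).items():
        print(f"{model}: {total}")
```

Under that math assumption, the listed scores sum to 12.5, 11.5, and 11, which is one point above the final total the review reports for each model. The article does not say where the difference comes from, so the sketch should be read as a tally of the bullets above, not as a replacement for the review's own table.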
"The result is clear — ChatGPT 5.4 won."
Price and compromises
The final table turned out to be telling. ChatGPT 5.4 scored 11.5 points and spent 112 rubles. Gemini 3.1 Pro finished the test with 10.5 points and total expenses of 87 rubles, making it the most economical option. Claude Opus 4.6 received 10 points but cost 208 rubles, almost twice as much as ChatGPT and more than twice as much as Gemini.
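The price comparison can be checked with a small Python sketch built on the reported totals. The function names are hypothetical, and the programming-stage cost is an inference: the article does not state it, so the sketch assumes each total is simply the sum of the four stage costs and subtracts the three that are listed.

```python
# Totals as reported in the review: (points, rubles).
REPORTED = {
    "ChatGPT 5.4": (11.5, 112),
    "Gemini 3.1 Pro": (10.5, 87),
    "Claude Opus 4.6": (10.0, 208),
}

# Costs the article lists for the text, PDF, and math stages, in rubles.
LISTED_STAGE_COSTS = {
    "ChatGPT 5.4": [25, 24, 15],
    "Gemini 3.1 Pro": [20, 16, 22],
    "Claude Opus 4.6": [68, 38, 29],
}


def rubles_per_point(reported):
    """Total cost divided by total points, rounded to two decimals."""
    return {m: round(cost / points, 2) for m, (points, cost) in reported.items()}


def implied_code_stage_cost(reported, listed):
    """Total cost minus the three listed stages (inferred, not from the source)."""
    return {m: reported[m][1] - sum(listed[m]) for m in reported}


def cost_ratio(reported, model, baseline):
    """How many times more expensive `model` was than `baseline` overall."""
    return round(reported[model][1] / reported[baseline][1], 2)


if __name__ == "__main__":
    print(rubles_per_point(REPORTED))
    print(implied_code_stage_cost(REPORTED, LISTED_STAGE_COSTS))
    print(cost_ratio(REPORTED, "Claude Opus 4.6", "ChatGPT 5.4"))
    print(cost_ratio(REPORTED, "Claude Opus 4.6", "Gemini 3.1 Pro"))
```

On these figures Gemini comes out at roughly 8.3 rubles per point, ChatGPT at roughly 9.7, and Claude at 20.8, while Claude's total is about 1.86 times ChatGPT's and about 2.39 times Gemini's, which matches the "almost twice" and "more than twice" wording.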
On price alone, Google's model leads; on the balance of quality and cost, the advantage goes to OpenAI. The review does not claim to be a universal academic benchmark, however. The author compares the models on a narrow set of everyday tasks and in places relies on personal editorial judgment, especially on text style and interface convenience.
That is exactly what makes the material useful: it shows not laboratory records but how the models behave in practical work. In this selection, Gemini looks like a rational budget option, Claude like an expensive and inconsistent one, and ChatGPT like the most stable compromise.
What this means
For anyone choosing a single model for a broad set of everyday tasks, this comparison puts ChatGPT 5.4 ahead: it is not the best everywhere, but it most often delivers an even result for reasonable money. Gemini 3.1 Pro remains a strong alternative for those watching their budget closely, while Claude Opus 4.6 comes out of this test looking like a less attractive choice than before.