A Year Later: Qwen3 Still Holds the Price/Quality Crown — LLM Model Battletest

Q: What is the source?

Originally published on Habr AI. Hamidun News processes and adapts the material with AI.

Q: When was it published?

Jun 11, 2026. Reading time: 3 min.

LLM battletest results show Qwen3-235B from July 2025 once again leading in price/quality ratio. Over the year, Gemini improved by 40 points, while DeepSeek…

Hamidun News Editorial




I put four LLMs into one batch to check whether the smaller Gemma really had surpassed the larger one in cross-session tests. The results proved far more interesting than expected.

Head-to-Head: Neither Gemma Surpassed the Other

In a fair head-to-head comparison, the surprising cross-session result disappeared: both Gemma versions scored the same, with no difference at all. But that was only the beginning. DeepSeek V4 Flash, which I had previously rated at 83 points, scored 89 this time, 6 points higher. The model had been underrated, and that became the main finding of the batch test. Overrating one model can lead to underrating the rest of the hierarchy, so fair head-to-head comparisons in a single context remain the gold standard.

Qwen Has Held the Crown for a Year

Meanwhile, Qwen3-235B-A22B-2507 (released July 21, 2025) once again took first place in price/quality ratio. This is a July checkpoint, almost a year old, and competitors still haven't displaced it. Much has happened over the year. Gemini jumped from 57 to 97 points, a 40-point gain. I re-tested DeepSeek three times and got new results each time. New contenders appeared. But Qwen? It simply holds the throne.

Gemini: +40 points over the year
DeepSeek V4 Flash: underrated by 6 points
Qwen3: still best for price/quality
MiniMax: generated buzz, solid in tests, but not revolutionary
Eight new June models: did not displace the leader
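
The point changes in the summary above can be re-derived from the scores quoted in the article. The snippet below is only a small arithmetic check, not the author's test harness; the names `SCORES` and `score_change` are made up for illustration.

```python
# Scores quoted in the article: (earlier score, latest score).
SCORES = {
    "Gemini": (57, 97),
    "DeepSeek V4 Flash": (83, 89),
}


def score_change(before: int, after: int) -> int:
    """Return the point difference between two test runs."""
    return after - before


# Compute the change for every model listed above.
changes = {name: score_change(*pair) for name, pair in SCORES.items()}

for name, delta in changes.items():
    print(f"{name}: {delta:+d} points")
```

This prints a gain of 40 points for Gemini and 6 points for DeepSeek V4 Flash, matching the figures in the text.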

New Metrics and MiniMax's Buzz

A new criterion was added to the rating update: generation speed. It turned out that speed and quality don't always go hand in hand. A model can generate quickly yet lag on current data, or vice versa. MiniMax deserves a separate mention. It is widely praised, and in terms of capabilities it comes close to Opus, but there was also a lot of hype around it. In fair tests, it shows results worthy of attention, but not revolutionary enough to rewrite the hierarchy.

What Does This Mean

If you're weighing quality against price, Qwen3-235B remains the best choice for most tasks. Other models are more specialized: Gemini for multimodality, DeepSeek for experimentation, MiniMax for those willing to pay more.
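
The article does not say how the price/quality ratio is calculated. One possible reading is "test points per dollar", and the sketch below ranks candidates on that assumption. The `Candidate` type, the `value_ratio` formula, and every number in it are hypothetical placeholders, not measurements of any real model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    quality: float         # test score, higher is better
    price_per_mtok: float  # hypothetical price in USD per million tokens


def value_ratio(c: Candidate) -> float:
    """Quality points per dollar: an assumed definition, not the author's."""
    if c.price_per_mtok <= 0:
        raise ValueError("price must be positive")
    return c.quality / c.price_per_mtok


def rank_by_value(candidates: list[Candidate]) -> list[Candidate]:
    """Sort candidates from best to worst value ratio."""
    return sorted(candidates, key=value_ratio, reverse=True)


# Placeholder entries for illustration only.
candidates = [
    Candidate("model-a", quality=90.0, price_per_mtok=3.0),
    Candidate("model-b", quality=80.0, price_per_mtok=1.0),
    Candidate("model-c", quality=95.0, price_per_mtok=10.0),
]

ranking = rank_by_value(candidates)
for c in ranking:
    print(f"{c.name}: {value_ratio(c):.1f} points per dollar")
```

With these placeholder numbers the cheapest model ranks first even though it has the lowest score, which is the kind of outcome a price/quality ranking is meant to surface. A different definition, for example one that weights quality more heavily, could reorder the list.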

Hamidun News

AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.
