OpenAI, Qwen and GigaChat: Why Choosing AI Models Is Getting Harder for Russian Business
Russian businesses are increasingly running into an unpleasant fork in the road: Western LLMs are becoming less accessible, fully local models are too…
AI-processed from Habr AI; edited by Hamidun News
The Russian AI market is entering a phase where model choice is no longer just a matter of answer quality. For companies, it's now a combination of three factors: the availability of Western services, data requirements, and the cost of local infrastructure.
How the choice narrows
The author describes a situation where Western models like those from OpenAI and Anthropic are becoming increasingly inaccessible to Russian business, not only technically but also legally. Some vendors already enforce geoblocking and IP restrictions, and in regulated industries even formally permissible access through a proxy doesn't solve much. If a customer's name, phone number, or voice appears in a request to an external API, it looks like a cross-border transfer of personal data and runs into the requirements of Federal Law 152-FZ.
This puts AI agents for support, sales, and contact centers in a high-risk zone: what passes through these models is not abstract text but real user data. Against this backdrop, demand is growing inside Russia for "sovereign" solutions, but the word often hides adapted versions of foreign open-source systems rather than proprietary models.
And this is where the main compromise begins: the higher the formal independence, the heavier the economics.
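The personal-data concern described in this section can be made concrete with a minimal sketch: a pre-flight check that keeps a request on a local model whenever it appears to contain a Russian phone number. The pattern, the function names, and the "local"/"external" labels are all illustrative assumptions, not anything from the source article. Names and voice data would need other detection methods, and a regex check alone does not establish compliance with 152-FZ.

```python
import re

# Hypothetical pattern for Russian-style phone numbers such as
# "+7 (495) 123-45-67" or "89161234567". It is deliberately simple
# and will miss other formats.
PHONE_RE = re.compile(
    r"(?:\+7|8)[\s\-(]*\d{3}[\s\-)]*\d{3}[\s\-]*\d{2}[\s\-]*\d{2}"
)


def contains_phone(text: str) -> bool:
    """Return True if the text appears to contain a phone number."""
    return PHONE_RE.search(text) is not None


def choose_backend(text: str) -> str:
    """Route text with an apparent phone number to the local model.

    The "local" and "external" labels are placeholders for whatever
    backends a real system would use.
    """
    return "local" if contains_phone(text) else "external"


if __name__ == "__main__":
    for sample in ("Call me at +7 (495) 123-45-67", "What are your opening hours?"):
        print(f"{choose_backend(sample):8s} <- {sample}")
```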
Three working scenarios
The market has essentially settled on three approaches. The first is to build a base model from scratch, as Sber does with the GigaChat family. The second is to take a strong open model, most often from the Qwen family, and fine-tune it on Russian-language corpora and domain data, as Yandex, T-Bank, and Avito do. The third is to keep using Western APIs in a legal gray zone, if the business is willing to accept the legal risk.
- GigaChat: maximum control and locality, but very expensive training and inference.
- Qwen after fine-tuning: noticeably cheaper and faster to launch, but sovereignty here is conditional.
- OpenAI and Anthropic: strong quality and clear economics, but access is becoming increasingly unstable.
- Hybrid schemes: a compromise for mid-market business that starts in the cloud and later migrates to its own on-premises infrastructure.
The problem is that each path has costs that cannot be ignored. Training from scratch requires tens or even hundreds of millions of dollars, a large volume of data, and scarce H100- or H200-class GPUs. Fine-tuning Qwen looks more realistic, but the base architecture and weights remain Chinese. Under strict regulatory logic, this is not complete independence but a carefully localized compromise.
Where the money is lost
The most painful argument in the article is not model quality but inference pricing. According to the author's calculations on their own agent platform, a minute of work on a comparable OpenAI model costs less than 1 ruble, while a minute on GigaChat-Max costs around 80 rubles. For voice agents and contact centers, this is a difference not of a few percentage points but of almost two orders of magnitude. With that cost model, you can build a technically good product but cannot justify it economically.
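The arithmetic behind "almost two orders of magnitude" can be checked with a short sketch. The two per-minute prices are the article's figures, with "less than 1 ruble" taken at its upper bound of 1.0 as a simplifying assumption. The monthly workload of 100,000 agent-minutes is purely hypothetical and only illustrates how the gap scales.

```python
import math

# Per-minute figures quoted in the article. "Less than 1 ruble" is
# taken as 1.0 here, which is an assumption (the real gap is larger).
OPENAI_RUB_PER_MIN = 1.0
GIGACHAT_MAX_RUB_PER_MIN = 80.0

# Hypothetical workload, not from the article.
MINUTES_PER_MONTH = 100_000


def cost_gap(cheap: float, expensive: float) -> tuple[float, float]:
    """Return the price ratio and the same gap in orders of magnitude."""
    ratio = expensive / cheap
    return ratio, math.log10(ratio)


def monthly_cost(rub_per_min: float, minutes: int) -> float:
    """Total monthly inference bill in rubles for a given workload."""
    return rub_per_min * minutes


if __name__ == "__main__":
    ratio, orders = cost_gap(OPENAI_RUB_PER_MIN, GIGACHAT_MAX_RUB_PER_MIN)
    print(f"ratio: {ratio:.0f}x, about {orders:.1f} orders of magnitude")
    for name, price in (("OpenAI-class", OPENAI_RUB_PER_MIN),
                        ("GigaChat-Max", GIGACHAT_MAX_RUB_PER_MIN)):
        print(f"{name}: {monthly_cost(price, MINUTES_PER_MONTH):,.0f} RUB/month")
```

Because the cheaper price is taken at its ceiling, the computed ratio of 80 is a lower bound on the gap the author describes.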
"A fully Russian solution is insanely expensive"
Infrastructure is an additional blow. The author estimates that a server capable of handling around a thousand simultaneous agent sessions costs approximately 55 million rubles. Then another trap kicks in: to keep the cost per token relatively low, GPUs need to run at 80-90% load. With small and uneven demand, that is difficult. Equipment sits idle, while the costs of electricity, maintenance, and depreciation don't go away. That's why AI pays off first and foremost where human labor is expensive and load is constant: support, contact centers, legal functions.
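The utilization trap can be illustrated with a rough sketch of hardware cost per session-minute as a function of load. The server price and session capacity are the author's estimates. The three-year amortization period is an assumption, and electricity, maintenance, and staff are deliberately left out, so the absolute numbers are not comparable to the API prices quoted earlier.

```python
SERVER_COST_RUB = 55_000_000   # author's estimate from the article
CONCURRENT_SESSIONS = 1_000    # author's estimate from the article
AMORTIZATION_YEARS = 3         # assumption, not from the article

MINUTES_PER_YEAR = 365 * 24 * 60


def hardware_cost_per_session_minute(utilization: float) -> float:
    """Amortized hardware cost in rubles per served session-minute.

    utilization is the average fraction of session capacity actually
    in use, in the range (0, 1].
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    capacity = CONCURRENT_SESSIONS * MINUTES_PER_YEAR * AMORTIZATION_YEARS
    return SERVER_COST_RUB / (capacity * utilization)


if __name__ == "__main__":
    for u in (0.85, 0.40, 0.10):
        cost = hardware_cost_per_session_minute(u)
        print(f"{u:.0%} load: {cost:.3f} RUB per session-minute")
```

The cost is inversely proportional to utilization, so under these assumptions a server running at 10% load is 8.5 times more expensive per served minute than the same server at 85%.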
What this means
For product teams, the conclusion is quite harsh: building your entire architecture on a single provider is already dangerous. If a company works with Russian-language LLMs, it needs a model-agnostic design that can switch quickly between OpenAI, GigaChat, Qwen-like solutions, and a local on-premises deployment. Otherwise, any new round of blocking, price changes, or data requirements quickly turns a technical choice into a business problem.
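One way to picture such a model-agnostic design is a thin provider abstraction with ordered fallback. The sketch below is a minimal illustration, not the author's platform. The `Provider` type, `ProviderError`, and both stub backends are hypothetical, and no real vendor API is called; a real system would wrap each vendor's client behind the same `complete` signature.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class ProviderError(Exception):
    """Raised by a backend that is blocked, down, or over budget."""


@dataclass
class Provider:
    name: str
    complete: Callable[[str], str]  # prompt in, completion text out


def complete_with_fallback(prompt: str,
                           providers: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in order; return (provider name, completion)."""
    errors = []
    for provider in providers:
        try:
            return provider.name, provider.complete(prompt)
        except ProviderError as exc:
            errors.append(f"{provider.name}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))


# Hypothetical stubs standing in for real backends.
def blocked_external(prompt: str) -> str:
    raise ProviderError("geoblocked")


def local_stub(prompt: str) -> str:
    return f"[local] {prompt}"


if __name__ == "__main__":
    chain = [Provider("external-stub", blocked_external),
             Provider("local-stub", local_stub)]
    used, answer = complete_with_fallback("Hello", chain)
    print(used, "->", answer)
```

Keeping the provider order in configuration rather than in code means that a new block or price change becomes a change to a list instead of a rewrite.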