Russian in ChatGPT costs 2x more: tokenization is to blame
Russian text in ChatGPT and other cloud-based LLMs costs about twice as much as English. The reason is the way neural networks split text into tokens: an Engli

When you send a request to ChatGPT or another cloud-based neural network, it doesn't work with letters and words directly. The text is first broken down into tokens: short fragments (whole words, parts of words, or single characters) that the model can process. This breakdown determines the cost of the request, the speed of the response, and how much information fits in the context window at once.
How tokenization works
Tokenization is the process of slicing text into tokens, and different neural networks slice text differently. English text is sliced very efficiently: a word usually takes one or two tokens. A common word like "contract" is typically a single token. A 1000-word English text requires approximately 1200-1500 tokens.
Russian is not so fortunate: the same content requires 2-3 times more tokens. The Russian word "разработка" requires two or three tokens. "Программирование" requires three or four. And an adjective like "искусственный" can take four or five tokens. A 1000-word Russian text requires 2500-3500 tokens.
This happens because English was used far more heavily than Russian in training modern large language models. The token dictionary that model creators assembled from massive amounts of mostly English-language content represents English vocabulary much better, so Cyrillic words end up split into more, smaller pieces.
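One contributing factor can be seen without any model at all. Assuming a byte-level tokenizer (one that starts from UTF-8 bytes and merges frequent sequences, which is an assumption about the tokenizer and not something stated above), a Cyrillic word starts out as a longer sequence than a Latin word of the same length, because each Cyrillic letter takes two bytes in UTF-8. A minimal sketch, with the hypothetical helper `utf8_stats`:

```python
# Compare character count with UTF-8 byte count for example words.
# Assumption: the tokenizer works on UTF-8 bytes before merging them
# into tokens; the helper name utf8_stats is illustrative only.

def utf8_stats(word: str) -> tuple[int, int]:
    """Return (number of characters, number of UTF-8 bytes) for a word."""
    return len(word), len(word.encode("utf-8"))


for word in ["contract", "разработка"]:
    chars, nbytes = utf8_stats(word)
    print(f"{word}: {chars} characters, {nbytes} bytes")
```

The fewer learned merges a tokenizer has for a script, the closer its token count stays to this raw byte count.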
What it costs in practice
Because of this inequality in tokenization, Russian text in cloud services like OpenAI costs approximately twice as much as English for the same amount of actual information. Pricing is per token, so if processing an English text costs $1, the same content in Russian will cost about $2.
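The arithmetic can be sketched in a few lines. The token counts below are the midpoints of the article's ranges for a 1000-word text. The flat price of $1 per 1000 tokens is a hypothetical round number for illustration, not a real tariff, and `estimate_cost` is an illustrative helper.

```python
def estimate_cost(tokens: int, price_per_1k_tokens: float) -> float:
    """Cost of processing `tokens` tokens at a flat per-1000-token price."""
    return tokens / 1000 * price_per_1k_tokens


# Midpoints of the article's ranges for a 1000-word text.
EN_TOKENS = 1350  # midpoint of 1200-1500
RU_TOKENS = 3000  # midpoint of 2500-3500
PRICE = 1.0       # hypothetical price: $1 per 1000 tokens

en_cost = estimate_cost(EN_TOKENS, PRICE)
ru_cost = estimate_cost(RU_TOKENS, PRICE)
print(f"English: ${en_cost:.2f}")
print(f"Russian: ${ru_cost:.2f}")
print(f"Ratio:   {ru_cost / en_cost:.2f}x")
```

With these midpoints the ratio comes out slightly above 2x. Real providers usually price input and output tokens differently, so treat this as a rough estimate.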
It's easiest to notice this when working on large projects: localizing an application into Russian, translating documentation, or running a chatbot in Russian will cost twice as much as the same services for an English-language user.
But high cost is only the beginning of the problems. Processing Russian text is noticeably slower, because the model has to process more tokens and the response takes longer. The context window, the amount of text the model can hold in view at once, also shrinks by about half in terms of actual content. If a model has a context window of 128 thousand tokens, Russian text fits only about half as much information into it as English text does.
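To make the context-window effect concrete, here is a rough sketch that converts a token budget into an approximate word budget. The tokens-per-word figures are derived from the midpoints of the article's ranges (1350 and 3000 tokens per 1000 words). `words_that_fit` is a hypothetical helper, and real texts will deviate from these averages.

```python
def words_that_fit(context_tokens: int, tokens_per_word: float) -> int:
    """Approximate number of words that fit into a context window."""
    return int(context_tokens / tokens_per_word)


CONTEXT_TOKENS = 128_000
EN_TOKENS_PER_WORD = 1.35  # 1350 tokens per 1000 words (midpoint estimate)
RU_TOKENS_PER_WORD = 3.0   # 3000 tokens per 1000 words (midpoint estimate)

en_words = words_that_fit(CONTEXT_TOKENS, EN_TOKENS_PER_WORD)
ru_words = words_that_fit(CONTEXT_TOKENS, RU_TOKENS_PER_WORD)
print(f"English: about {en_words} words")
print(f"Russian: about {ru_words} words")
```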
Who it hits especially hard
- Russian-speaking developers using AI to work with documentation and code
- Companies processing large volumes of Russian text (translations, chatbots, analytics)
- Russian-language startups that build products on LLMs and cannot afford OpenAI's prices
- Researchers working with the Russian language and needing deep analysis through neural networks
- Authors and publishers who want to use AI for editing and rewriting texts
How to measure on your own data
The article author recommends checking the actual token ratio on your own texts: take a sample in English and its Russian equivalent, count the tokens through the OpenAI API, and compare. This takes about five minutes and shows the real cost for your case.
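A minimal sketch of such a measurement follows. It counts tokens locally with OpenAI's open-source `tiktoken` library instead of calling the API, which is a substitution made here for convenience. The `cl100k_base` encoding is an assumption, so pick the one that matches your model. The helper names `token_ratio` and `make_tiktoken_counter` are illustrative.

```python
from typing import Callable


def token_ratio(en_text: str, ru_text: str,
                count_tokens: Callable[[str], int]) -> float:
    """Return how many times more tokens the Russian text needs."""
    en_tokens = count_tokens(en_text)
    ru_tokens = count_tokens(ru_text)
    if en_tokens == 0:
        raise ValueError("English sample produced no tokens")
    return ru_tokens / en_tokens


def make_tiktoken_counter(encoding_name: str = "cl100k_base") -> Callable[[str], int]:
    """Build a token counter backed by tiktoken (pip install tiktoken)."""
    import tiktoken  # imported lazily so token_ratio works without it

    encoding = tiktoken.get_encoding(encoding_name)
    return lambda text: len(encoding.encode(text))


# Example usage (requires tiktoken; the sample texts are placeholders):
# counter = make_tiktoken_counter()
# ratio = token_ratio("Software development contract",
#                     "Договор на разработку программного обеспечения",
#                     counter)
# print(f"Russian needs {ratio:.2f}x the tokens of English")
```

Passing the counter in as a function keeps the comparison independent of any one tokenizer, so the same check works for other providers. Use parallel texts with the same content, otherwise the ratio reflects the difference in the texts and not in the tokenization.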
What it means
The inequality in tokenization is a hidden tax on the Russian language in the era of large language models. This is not an error by the developers, but a natural consequence of how these models were built: on English content from the first generation of the internet. For the Russian-speaking community, this means accepting reality: either pay more and get slower results, or look for alternatives that were trained with better Cyrillic support.