Russian government proposes training AI on copyrighted materials without consent
The Russian government has built a controversial provision into its draft AI law: developers may be allowed to train models on books, articles, films, and…
AI-processed from CNews AI; edited by Hamidun News
Russia's government has included in its draft AI law a provision that could dramatically change the rules for Russian model developers. The Cabinet of Ministers proposes allowing neural networks to be trained on copyrighted materials without the consent of copyright holders, though only under the terms of the bill, which has not yet been adopted.
What authorities propose
In essence, companies would be able to use articles, books, films, images, and other copyrighted works for AI training without separate permission from the author. There is an important caveat: the end user of the service must not see the original content of such materials. In other words, the state is trying to legalize model training itself without opening the way for direct distribution of others' content through the product interface.
For the market, this is one of the most sensitive regulatory changes in recent memory. The draft law also contains a second fundamental provision: rights to AI output would be assigned to the user, provided the user did more than simply enter a prompt and made a genuine creative contribution by formulating the prompt, processing the response, and refining the result. This is an attempt to settle in advance who owns the value of the output while the market is still arguing about where automatic generation ends and human authorship begins. Put differently, lawmakers are trying to define the human's role in the final result.
Where the boundaries lie
According to the publication, the focus is primarily on datasets that are especially important for training powerful models but hard for Russian teams to access: scientific texts, educational materials, and archival documents. The authorities' logic is simple: the domestic data market is smaller than what global players can draw on, and without expanded access to content, Russian services will fall behind in quality and speed of development. This is especially true for teams building large language and multimodal models.
- Articles, books, films and images will be allowed for training without separate author consent
- The user should not see the protected original content
- Rights to the final result are planned to be given to the person, not the model
- Personal data, private correspondence and tax information are not subject to the relaxation
- The final version of the law has not yet been published
At the same time, the relaxation is not unconditional. Government representatives have already stressed that there is no final version of the document yet and that any technology must be applied with respect for the rights and interests of citizens. This means the mechanics of data access, the exceptions, and future liability are still open to negotiation among developers, copyright holders, industry associations, and the state. The sharpest arguments will almost certainly begin once specific wording and exceptions are on the table.
Why this is controversial
From a legal perspective, the main conflict remains. Lawyers point out that training a model can itself be interpreted as analysis rather than reproduction of a work, but storing materials for subsequent training still requires the author's consent unless the law introduces a separate exception. This is why the current initiative matters not as a technical detail but as an attempt to redraw the basic dividing line between the interests of AI companies and those of content owners. It is along this line that the most expensive legal disputes usually arise.
On the global market, this dispute has already escalated into high-profile lawsuits against Anthropic and OpenAI, and author associations in Europe and the US are increasingly going after developers of generative models. In Russia, however, the authorities are instead seeking a compromise in favor of industry growth, believing that overly strict restrictions would leave local teams without data and without a chance to compete with American and Chinese platforms. The authorities have voiced this logic before when discussing AI regulation in general.
"It is important not to 'suffocate' the technology with rules."
For copyright holders, however, this formula sounds alarming: if a model is trained on works without the author's consent and the user then receives rights to the result, the risk of blurring the boundaries of authorship only grows. A separate question is how to prove a violation when users themselves do not know what data the model was trained on or where specific fragments of an answer came from. These questions are likely to dominate the debate once the final version of the document is published, and the answers will determine how workable the new provision is in practice.
What it means
If the provision makes it into the final law, the Russian AI market will gain broader access to data, and the training of local models will likely speed up. At the same time, tension around copyright will grow: the easier it is to train neural networks on others' content, the more acute the question of where analysis ends and use of someone else's work begins. For startups, this is a chance to move faster; for authors and publishers, it is a reason to prepare for a new round of disputes over the boundaries of what is permissible.