Rerayt-Zavod showed the limit of AI rewriting: rules can be transferred, editorial voice cannot
The Rerayt-Zavod team explained why AI news rewriting can be factually accurate and structurally correct, yet still fail to sound like a specific media…
AI-processed from Habr AI; edited by Hamidun News
The Rerayt-Zavod ("Rewrite Factory") project, which automates news rewriting for regional media, has described the main limitation of its approach: the model can already reproduce text structure and formal editorial rules, but it does not always capture the voice of a specific publication. Using Fontanka articles as examples, the developers showed that the decisive factor is precise word choice, not the template.
Test on Fontanka
The developers trained the system on Fontanka's style and ran several texts through it, all about the same event: the detention of a 16-year-old in Ufa on suspicion of preparing a terrorist attack. The generated rewrite was grammatically correct, logical, and factually accurate: the lede was assembled correctly, attribution was in place, and no key details were lost. But set against Fontanka's actual text, it quickly became clear that the model writes like news in general, not like a specific publication.
The key difference came down to one word. The source text said "подросток" (teenager), while Fontanka's published version used "мальчик" (boy) and "школьник" (schoolboy). This choice does not change the facts of the story, but it does change its tone: next to mentions of recruitment and terrorism, the image of a child appears, which heightens the tension without any direct authorial judgment.
A neutral term conveys the fact, while a more precise editorial word also conveys authorial distance, rhythm, and the emotional weight of the phrase.
"Boy" instead of "teenager" — that's editorial intuition.
Where the rules break
The project uses an aspect-based approach to style: instead of one large prompt, the model receives a set of characteristics of a specific media outlet, including structure, tone, vocabulary, headlines, and other parameters. This approach works well where style can be described as a rule. For example, one can specify that the lede starts with a fact, attribution is given once, sentences are short on average, and the official toponym "Saint Petersburg" is better replaced with "Petersburg".
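To make the idea of rule-like style concrete, here is a minimal sketch of how some of the rules named above could be checked automatically. It is not the project's code: the `StyleProfile` class, its field names, the 18-word threshold, and the "according to" attribution marker are all assumptions made for illustration. The "lede starts with a fact" rule is left out because it does not reduce to a simple string check.

```python
import re
from dataclasses import dataclass, field


@dataclass
class StyleProfile:
    """Hypothetical per-outlet profile; names and defaults are illustrative."""
    max_avg_sentence_words: float = 18.0  # assumed threshold, not from the source
    max_attributions: int = 1             # "attribution is given once"
    attribution_markers: tuple = ("according to",)  # assumed marker phrase
    toponym_replacements: dict = field(
        default_factory=lambda: {"Saint Petersburg": "Petersburg"}
    )


def split_sentences(text: str) -> list[str]:
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def check_style(text: str, profile: StyleProfile) -> list[str]:
    """Return human-readable rule violations (an empty list if none)."""
    violations = []

    # Rule: sentences are short on average.
    sentences = split_sentences(text)
    if sentences:
        avg = sum(len(s.split()) for s in sentences) / len(sentences)
        if avg > profile.max_avg_sentence_words:
            violations.append(f"average sentence length {avg:.1f} words")

    # Rule: attribution appears at most the allowed number of times.
    lowered = text.lower()
    attributions = sum(lowered.count(m) for m in profile.attribution_markers)
    if attributions > profile.max_attributions:
        violations.append(f"attribution used {attributions} times")

    # Rule: prefer the outlet's toponym over the official one.
    for official, preferred in profile.toponym_replacements.items():
        if official in text:
            violations.append(f'use "{preferred}" instead of "{official}"')

    return violations
```

A check like this can say that "Saint Petersburg" slipped through or that attribution was repeated. It has nothing to say about whether "boy" or "teenager" is the right word in a given story.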
All this can be measured, verified, and fairly reliably reproduced on new texts. The problem begins where style consists not of prohibitions and instructions, but of micro-choices in a specific context. A formula like "neutral-informational tone with elements of colloquiality" sounds plausible, but tells almost nothing about which exact word an editor would choose in a sensitive story.
The same applies to the construction "according to the investigation": it's not just a source, but a way to embed distance into the phrase itself. Such decisions don't reduce to a stable set of rules, because in another situation the same publication might write much more dryly.
What they're fixing next
The developers don't consider this a bug in the narrow sense. Rather, it is a limitation of the method itself: structure is conveyed through instructions, while voice is usually conveyed through examples. That is why the team is now strengthening the context around generation rather than the abstract rules. The logic is simple: a model imitates an editorial technique it has seen better than it follows a verbal description of a subtle intonation that cannot be reliably formalized for all cases. In practice, this shifts the focus of the work from prompting to selecting relevant examples.
- The number of examples to imitate increases from 3 to 10–15.
- Examples are selected by story type: crime to crime, emergency to emergency.
- The model is additionally checked for compliance with explicit prohibitions from the style guide.
- The agent verifies not only factual correctness but also the completeness of fact transfer in the rewrite.
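The first two items can be illustrated with a short sketch of example selection by story type. This is an assumption about how such a step might look, not a description of the team's implementation: the `Example` class, the story-type labels, the function names, and the prompt wording are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Example:
    story_type: str  # assumed labels such as "crime" or "emergency"
    text: str        # a published article from the target outlet


def select_examples(pool, story_type, k=15):
    """Pick up to k examples of the same story type, keeping pool order."""
    matching = [ex for ex in pool if ex.story_type == story_type]
    return matching[:k]


def build_prompt(source_text, examples):
    """Assemble a few-shot prompt: published examples first, then the source."""
    parts = [f"Example {i}:\n{ex.text}" for i, ex in enumerate(examples, 1)]
    parts.append(f"Rewrite in the same voice:\n{source_text}")
    return "\n\n".join(parts)
```

Matching crime stories to crime stories means the model sees how the outlet actually words sensitive material, rather than a description of that wording. A real system would likely rank examples by recency or similarity as well; the sketch simply takes the first `k` matches.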
In parallel, the team is refining the MVP positioning: the system should accurately reproduce structure and formal style features, while reproducing voice only approximately. This is a more honest framing for newsrooms that need fast, workable rewrites without promises that the result will be indistinguishable from a human author. By the team's assessment, this may already be sufficient for most regional media, because their stylistic differences are usually less pronounced than Fontanka's. In other words, the product promises text discipline and speed, not the magic of a complete match with a specific publication.
What this means
The story of "boy" vs. "teenager" shows an important boundary for editorial AI tools. They are already capable of saving time on routine work and quite accurately repeating text form, but subtle intonational decisions so far remain a zone of human editing. For news products, this means something simple: automated rewriting works if you promise speed and text discipline, not complete reproduction of a specific media outlet's voice. It's on this distinction that realistic expectations for newsroom automation must now be built.