Habr AI→ original

How a Russian IT Company Turned Corporate Knowledge into a Working RAG System

Tsifra, a company operating in industrial digitalization, published a detailed breakdown of its corporate RAG system. Instead of costly fine-tuning of language

AI-processed from Habr AI; edited by Hamidun News
How a Russian IT Company Turned Corporate Knowledge into a Working RAG System
Source: Habr AI. Collage: Hamidun News.
◐ Listen to article

The challenge of corporate knowledge management is familiar to any technology company that has grown beyond a hundred employees. Documentation is scattered across dozens of systems, expertise is locked in the minds of key people, and a new engineer spends weeks finding an answer to a question someone already solved six months ago. Engineers at "Tsifra" — a Russian developer of enterprise solutions for industry — decided to tackle this problem systematically and built a full-fledged RAG pipeline for internal use.

RAG, or Retrieval-Augmented Generation, has long ceased to be exotic in the corporate AI world. The idea is simple: instead of relying on a language model to "remember" the needed information, the system first finds relevant documents in a knowledge base, then passes them as context for generating an answer. However, there is a vast chasm of engineering decisions between a beautiful architectural diagram and a working product, and these details are what make Tsifra's case truly valuable for the industry.

The system architecture, described by leading training center engineer Dmitri Omarov and his colleague Fyodor Arefyev, is built on several key principles. The first and perhaps most important is a conscious refusal to fine-tune language models. Fine-tuning a corporate LLM sounds attractive in presentations, but in practice it is an expensive process that requires constant updates whenever documentation changes.

The team bet on a dynamic knowledge base: documents are indexed, converted into vector representations, and stored locally. When an employee asks a question, the system finds the most relevant fragments through vector search, then passes the results through a reranking stage — additional ranking that filters out noise and improves retrieval accuracy. Only then is the collected context sent to a cloud language model for generating the final answer.

The approach to information security deserves special attention — a painful issue for any company working with industrial customers. Sending internal documents to a cloud AI service without filtering is a direct path to data leakage. Tsifra engineers implemented a local cleaning layer: before context leaves the company perimeter, sensitive information is automatically removed from it. This is an elegant solution that allows leveraging the power of cloud LLMs without compromises in security. Essentially, the company gets the best of both worlds: local data control and the generation quality provided only by major cloud models.

Combating hallucinations is another front where the team achieved notable results. Language models tend to generate plausible-sounding but factually incorrect information, and in a corporate context this is unacceptable. An incorrect reference to a regulation or a flawed technical recommendation can lead to real consequences on the production floor. The solution turned out to be partly engineering, partly methodological: a carefully designed system prompt strictly constrains the model, requiring it to rely exclusively on the provided context and accompany each answer with citations to primary sources. If the knowledge base contains no information for an answer, the model must honestly admit this rather than fabricate.

This case is important not so much for specific technical solutions, but for the overall approach. Russian companies operate under specific conditions: access to leading cloud AI platforms is limited, regulatory requirements for data processing are strict, and budgets for AI infrastructure are far from unlimited. In this reality, RAG systems with a local search layer and controlled interaction with the cloud become, in essence, the standard architecture. They allow balancing between quality, cost, and security — three parameters that in corporate AI are almost always in conflict with each other.

Tsifra's experience also demonstrates a broader trend: corporate AI in 2026 is no longer about experimenting with chatbots, but about knowledge management infrastructure. Companies that learn to make their collective expertise accessible through intelligent search will gain a measurable competitive advantage. Time to onboard new employees shrinks, solving already-solved problems becomes a thing of the past, and critical information stops being held hostage by individual experts. Essentially, a RAG system transforms corporate memory from a passive archive into an active working tool — and that is where its true value lies.

ZK
Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.

Want to stop reading about AI and start using it?

AI News is a curated feed of AI/tech news. Hamidun Academy teaches you to use AI systematically in your work.

What do you think?
Loading comments…