
Croc Shows How It Built an Internal RAG Assistant for Corporate Data

Croc shared a case study of an internal RAG assistant for working with corporate knowledge in a closed environment. The company rejected external services…

AI-processed from Habr AI; edited by Hamidun News

Croc explained how it built an internal RAG assistant for working with corporate knowledge inside a closed environment. Instead of a 'chat for chat's sake', the company created a tool that cuts the time employees spend searching for relevant fragments in documents, wikis, and internal portals.

Why they chose RAG

The team started with a simple problem: built-in search exists in virtually all large corporate systems, yet it doesn't make employees' lives easier. Information is scattered across multiple repositories, each with its own rules, structure, and logic for returning results. People, meanwhile, tend to remember the gist of a document but not its exact wording, filename, or location. Standard search relies on word matching rather than context, and this is where it begins to fail.

They also tested a direct approach with GPT and document uploads, but it didn't work for large datasets. Several limitations came into play: context window size, answer instability, and the risk of exposing sensitive documents outside the company's infrastructure. This is why Croc chose the RAG approach: first find relevant fragments from internal sources, then pass them to a language model to assemble the answer.

"We needed an AI tool that serves a purpose, not AI for trend's sake—a managed corporate assistant."

RAG was chosen in this project for four practical reasons:

  • Documents remain within the company's infrastructure;
  • Only the necessary context is passed to the model, not the entire dataset;
  • Indexing and retrieval can be controlled separately;
  • The risk of hallucinations is lower because the answer is grounded in real sources.
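The retrieve-then-generate flow described above can be sketched in a few lines. This is only an illustration, not Croc's implementation: the function names, the sample chunks, and the prompt wording are all hypothetical, and relevance is scored by plain word overlap purely to keep the sketch self-contained. A production system would normally use semantic (embedding-based) retrieval, which the article does not detail.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, chunk: str) -> int:
    """Toy relevance score: number of query tokens that also occur in the chunk."""
    q = Counter(tokenize(query))
    c = Counter(tokenize(chunk))
    return sum(min(q[t], c[t]) for t in q)


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k chunks with a non-zero score, best first."""
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    return [ch for ch in ranked[:top_k] if score(query, ch) > 0]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the text sent to the model: selected context only, then the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {query}"
    )


# Hypothetical internal fragments; in practice these come from the index.
CHUNKS = [
    "Vacation requests are submitted through the HR portal.",
    "The VPN client is installed from the internal software catalog.",
    "Expense reports are due by the fifth working day of the month.",
]

question = "How do I submit a vacation request?"
selected = retrieve(question, CHUNKS, top_k=1)
prompt = build_prompt(question, selected)
print(prompt)
```

Only the selected fragment reaches the prompt; the other documents never leave the store, which is the property the second bullet describes.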

How the system was organized

The solution is built around two separate pipelines. The first handles assistant creation: users can add files, links to portals, or knowledge base spaces through the corporate messenger or a pilot web interface. The assistant management system then checks access rights, runs the necessary parsers, prepares the data, and only then sends it to the RAG core for indexing. Once the index is ready, the employee receives a notification and can begin a dialogue.
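The order of steps in the creation flow (access check, parsing, preparation, indexing, notification) might look roughly like the sketch below. Every name here (`Source`, `Index`, `PARSERS`, `ingest`, the `notify` callback) is hypothetical, and the wiki "parser" is a trivial stand-in for the real cleaning services.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Source:
    name: str
    kind: str                 # hypothetical labels such as "file" or "wiki"
    raw: str
    allowed_users: set[str]


@dataclass
class Index:
    chunks: list[str] = field(default_factory=list)


# One parser per source type; real parsers would be far more involved.
PARSERS: dict[str, Callable[[str], str]] = {
    "file": lambda raw: raw,
    "wiki": lambda raw: raw.replace("{{", "").replace("}}", ""),
}


def ingest(user: str, source: Source, index: Index,
           notify: Callable[[str, str], None]) -> bool:
    # 1. Check access rights before touching the content.
    if user not in source.allowed_users:
        notify(user, f"Access denied: {source.name}")
        return False
    # 2. Pick the parser for this source type.
    parser = PARSERS.get(source.kind)
    if parser is None:
        notify(user, f"Unsupported source type: {source.kind}")
        return False
    # 3. Prepare the data: strip markup, normalize whitespace, split into paragraphs.
    text = parser(source.raw)
    chunks = [" ".join(p.split()) for p in text.split("\n\n") if p.strip()]
    # 4. Hand the prepared chunks to the index.
    index.chunks.extend(chunks)
    # 5. Tell the user the assistant is ready.
    notify(user, f"Index ready: {source.name} ({len(chunks)} chunks)")
    return True


messages: list[tuple[str, str]] = []
index = Index()
wiki = Source("onboarding", "wiki",
              "{{Welcome}} to the team.\n\nAsk HR   for a badge.", {"alice"})

ok = ingest("alice", wiki, index, lambda u, m: messages.append((u, m)))
denied = ingest("bob", wiki, index, lambda u, m: messages.append((u, m)))
```

Putting the access check first means a source the user cannot read is never parsed or indexed on their behalf.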

The second is the dialogue pipeline. User requests go through the corporate LLM bus, and the RAG engine pre-selects relevant context. Access rights are checked not just once at upload time, but on every query to the assistant. Because ACLs are complex and differ across systems, the team moved away from shared assistants in some scenarios and switched to personal ones. This is less convenient, but it reduces the risk of an employee seeing data they shouldn't have access to.
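A per-query access check of the kind described above could be sketched as follows. The group-based model, the `Chunk` fields, and the function names are assumptions made for illustration; the article does not describe how Croc represents permissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    source: str
    allowed_groups: frozenset[str]


def visible_chunks(chunks: list[Chunk], user_groups: set[str]) -> list[Chunk]:
    """Keep only chunks that share at least one group with the user."""
    return [c for c in chunks if c.allowed_groups & user_groups]


def context_for_query(terms: list[str], chunks: list[Chunk],
                      current_groups: set[str]) -> list[str]:
    # Permissions are evaluated at query time, using the user's current
    # groups, rather than being trusted from the moment of ingestion.
    permitted = visible_chunks(chunks, current_groups)
    return [c.text for c in permitted
            if any(t.lower() in c.text.lower() for t in terms)]


# Hypothetical indexed fragments with their ACLs.
STORE = [
    Chunk("Salary bands for 2024 are listed in the HR handbook.", "hr-wiki",
          frozenset({"hr"})),
    Chunk("Salary questions go to your manager first.", "portal",
          frozenset({"hr", "engineering"})),
]

hr_view = context_for_query(["salary"], STORE, {"hr"})
eng_view = context_for_query(["salary"], STORE, {"engineering"})
revoked_view = context_for_query(["salary"], STORE, set())
```

Filtering before relevance matching ensures restricted text never reaches the model's context, and a user whose groups were revoked after ingestion gets nothing back.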

They had to build connectors and preprocessing for nearly every data source type. The corporate portal, knowledge base, personal files, and wiki pages were too different to pass directly to a single RAG core out of the box. So they extracted data cleaning, normalization, and preparation into separate services. For the main entry point, they chose an internal messenger based on Express: Telegram initially seemed convenient, but using an external service for sensitive information was immediately ruled out.

Where problems arose

The most painful challenges came not from the interface, but from data and processes. Wikis with complex markup required extensive manual cleaning. Tables and numerical data produced unstable answers. PDFs with scans and graphics broke basic parsing. The model handled visual structures such as organizational charts worse than plain text and could confuse relationships between departments and managers.

Additionally, users expected RAG to behave like an exhaustive search that returns every match, whereas the approach ranks the most probable context and does not guarantee a complete list of occurrences. Vendor collaboration was equally challenging. Croc tested several platforms on its own document sets and typical queries, then spent nearly a year refining the solution with its supplier. Updates became a problem of their own because they could dramatically change answer quality: one version dropped the quality metric from 85% to 70%.
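The mismatch between user expectations and ranked retrieval, the first point in the paragraph above, is easy to see in a toy comparison. The documents and both functions below are hypothetical, and counting term occurrences stands in for whatever ranking a real RAG engine uses.

```python
def exhaustive_matches(term: str, docs: list[str]) -> list[str]:
    """Classic search: return every document that mentions the term."""
    return [d for d in docs if term.lower() in d.lower()]


def top_k_matches(term: str, docs: list[str], k: int) -> list[str]:
    """Ranked retrieval: return only the k highest-scoring documents."""
    scored = [(d.lower().count(term.lower()), d) for d in docs]
    ranked = sorted((s for s in scored if s[0] > 0),
                    key=lambda s: s[0], reverse=True)
    return [d for _, d in ranked[:k]]


# Hypothetical documents mentioning a supplier called "Alfa".
DOCS = [
    "Contract 14: supplier Alfa, renewal in March.",
    "Alfa invoice disputed; Alfa account manager notified.",
    "Meeting notes: Alfa, Beta, and Gamma pricing compared.",
    "Office move schedule for Q3.",
    "Alfa, Alfa, Alfa: escalation log for repeated outages.",
]

everything = exhaustive_matches("Alfa", DOCS)
ranked = top_k_matches("Alfa", DOCS, k=2)
print(len(everything), len(ranked))
```

A question like "list every document that mentions Alfa" needs the first function; an assistant built on the second will answer from two documents and silently omit the rest.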

Because the coupling between the RAG engine and its built-in LLM was opaque, the team requested a separate interface between them so they could choose the model independently and manage further context processing themselves. For quality control, they introduced benchmarks, reference questions, regular checks, and even a separate LLM judge that compares actual answers to expected ones.
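A regression check built from reference questions and a judge might be organized as in the sketch below. This is an assumption-heavy illustration: the `judge` here is a simple token-overlap stand-in, not an LLM, and the benchmark entries, thresholds, and the two assistant stubs are hypothetical. In a real setup the judge function would call a separate model.

```python
import re
from typing import Callable


def judge(expected: str, actual: str, threshold: float = 0.6) -> bool:
    """Stand-in judge: pass if enough expected tokens appear in the answer."""
    exp = set(re.findall(r"[a-z0-9]+", expected.lower()))
    act = set(re.findall(r"[a-z0-9]+", actual.lower()))
    if not exp:
        return False
    return len(exp & act) / len(exp) >= threshold


def pass_rate(assistant: Callable[[str], str],
              benchmark: list[tuple[str, str]]) -> float:
    """Share of reference questions whose answer the judge accepts."""
    passed = sum(judge(expected, assistant(q)) for q, expected in benchmark)
    return passed / len(benchmark)


def is_regression(baseline: float, candidate: float,
                  tolerance: float = 0.05) -> bool:
    """Flag a release whose score falls below the baseline by more than tolerance."""
    return candidate < baseline - tolerance


# Hypothetical reference questions with expected answers.
BENCHMARK = [
    ("Where are vacation requests submitted?", "HR portal"),
    ("When are expense reports due?", "fifth working day"),
]


def old_version(question: str) -> str:
    answers = {
        BENCHMARK[0][0]: "Use the HR portal.",
        BENCHMARK[1][0]: "By the fifth working day of the month.",
    }
    return answers[question]


def new_version(question: str) -> str:
    answers = {
        BENCHMARK[0][0]: "Use the HR portal.",
        BENCHMARK[1][0]: "I could not find that.",
    }
    return answers[question]


old_score = pass_rate(old_version, BENCHMARK)
new_score = pass_rate(new_version, BENCHMARK)
blocked = is_regression(old_score, new_score)
```

Run on every vendor update, a gate like this would flag a drop such as the 85% to 70% case mentioned earlier (`is_regression(0.85, 0.70)` is true) before the new version reaches users.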

What this means

Croc's case illustrates well that corporate AI today is not so much about choosing a model as it is about engineering around data, access, and testing. RAG by itself doesn't solve the problem if there are no connectors, collection isolation, ACL control, and clear quality metrics. But when all of this is brought together in one system, an internal assistant can truly eliminate hours of routine searching and make working with corporate knowledge significantly faster.
