Runity showed how it is building an enterprise RAG assistant for Confluence and GitLab

AI-processed from Habr AI; edited by Hamidun News
Source: Habr AI. Collage: Hamidun News.

Runity described how it turned an internal prototype into a corporate RAG assistant for working with documentation and code. The system searches Confluence and GitLab at the same time, respects access rights, and runs in a closed environment without sending corporate data to external services.

From Idea to Implementation

The project grew out of a very practical pain point. At the beginning of 2025, the team needed to make sense of the old RU-CENTER website and the Reg.ru rebranding at the same time: figure out what was already implemented, where the current documentation lived, and which pieces of code were responsible for specific functions.

Manual searching took hours: Confluence held several versions of the same document, and in GitLab the team had to wade through branches and legacy code written in outdated JavaScript. The first local neural networks approved by the information security team already sped up the work. According to the team, technical specifications for frontend developers were ready in a few days instead of requiring lengthy manual markup and analysis.

After that, the company decided not to stop at isolated experiments and built a separate product that could be integrated into the daily workflow of developers and architects. The prototype, which started as a personal initiative, later became part of Runity's Hybrid Intelligence Center, an internal division focused on AI pilots and applied scenarios with measurable results.

Access and Security

The main question from the security team was predictable: who exactly would see corporate documents through such an assistant. The team solved this not with a separate policy on top of the model, but at the architecture level. The bot doesn't store a permissions matrix internally.

Users themselves add their personal Confluence and GitLab tokens, after which the system checks access to each found document via API. If there's no access, that fragment simply doesn't enter the model's context. Essentially, the decision about access is made not by the LLM, but by the code.
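The per-document check described above can be sketched in a few lines. This is a minimal illustration, not Runity's code: the `Chunk` type, the `filter_by_access` function, and the checker callables are hypothetical names, and a real checker would presumably call the Confluence or GitLab API with the user's own token and look at the response status.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Chunk:
    source: str   # "confluence" or "gitlab"
    doc_id: str   # hypothetical identifier: page ID, project path, etc.
    text: str


# Hypothetical signature: (user_token, chunk) -> True if the user may read it.
AccessCheck = Callable[[str, Chunk], bool]


def filter_by_access(
    chunks: Iterable[Chunk],
    tokens: Dict[str, str],
    checks: Dict[str, AccessCheck],
) -> List[Chunk]:
    """Keep only the chunks the user can read; everything else is dropped."""
    allowed = []
    for chunk in chunks:
        token = tokens.get(chunk.source)
        check = checks.get(chunk.source)
        if token is None or check is None:
            continue  # no token or no checker for this source: deny
        try:
            if check(token, chunk):
                allowed.append(chunk)
        except Exception:
            # A failed check is treated as "no access" (fail closed).
            continue
    return allowed
```

The point of the design is that the filter runs in ordinary code before the prompt is assembled, so the model never sees a fragment it would have to be trusted to withhold.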

This reduces the risk of data leaks and keeps corporate data inside the company's own closed environment. There is one tradeoff: synchronous access checks slow down the response. But the team claims that even so, a task that previously took several hours now takes five to seven minutes.

After additional refinements, logging, and interface fixes, the project received security approval and went into deployment.

"If a process can be described as a sequence of actions—it can be automated."

Stack and Scenarios

Inside, the system follows a classic RAG scheme: the query is converted into an embedding, Qdrant selects semantically similar documents from Confluence and GitLab, the security layer filters out everything the user cannot access, and the model generates an answer with links to specific sources. This approach was chosen over fine-tuning: the team values up-to-date context at query time more than retraining the model on corporate data. The stack uses Python, Temporal, Qdrant, PostgreSQL, Next.js, LangGraph, and locally deployed Qwen models, while data in the vector database is currently updated through nightly rebuilds.

Instead of one universal assistant, Runity created four specialized modes. This approach came not from abstract architecture but from the requests of different roles within the company: developers need a code assistant, architects need a quick way into the current landscape, and managers need a way to automatically assemble a tech radar of the stack and dependencies in repositories.

This also simplifies product development: individual scenarios are easier to test, measure, and refine without trying to solve every task with a single prompt.

  • a general chatbot for questions about internal documents and quick project onboarding;
  • a tech radar agent that goes through repositories and gathers a picture of languages and libraries;
  • an agent for architectural planning that helps understand the current landscape before launching a new project;
  • a programming partner that knows the internal codebase and team requirements.
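The retrieval flow described earlier in this section (embed the query, find similar documents, filter by access, generate with source links) can be sketched end to end. This is a toy illustration under loud assumptions: a word-count vector stands in for a real embedding model, an in-memory list stands in for Qdrant, the `can_read` callable stands in for the security layer, and the function stops at building the prompt rather than calling a Qwen model. All names here are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Doc:
    url: str
    text: str


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, index: List[Doc], top_k: int = 3) -> List[Doc]:
    """Stand-in for the vector search step: rank docs by similarity."""
    q = embed(query)
    scored = [(cosine(q, embed(d.text)), d) for d in index]
    scored = [(s, d) for s, d in scored if s > 0]  # drop unrelated docs
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


def build_prompt(
    query: str,
    index: List[Doc],
    can_read: Callable[[Doc], bool],
    top_k: int = 3,
) -> str:
    """Retrieve, filter by access, then assemble a prompt with source links."""
    hits = retrieve(query, index, top_k)
    visible = [d for d in hits if can_read(d)]  # security layer runs in code
    context = "\n".join(
        f"[{i}] {d.url}\n{d.text}" for i, d in enumerate(visible, 1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"{context}\n\nQuestion: {query}"
    )
```

Because the index is queried at request time, refreshing the answers only requires rebuilding the index, which fits the nightly rebuilds mentioned above and is the practical argument for retrieval over fine-tuning.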

The cost of such a solution is far from a "toy for experiments." For corporate use with multiple users, the team estimates a need of approximately four A100-level GPUs with 24 GB of memory, which costs 160–200 thousand rubles per month in compute alone. If you don't have your own GPUs, the entry threshold for a small local setup starts at roughly 500 thousand rubles, and development still requires a backend developer, a frontend developer, an ML engineer, and a data engineer.

What This Means

Runity's case shows that corporate AI assistants are quickly shifting from the idea of "just bolt on a chatbot" to full-fledged internal products with RAG, access verification, and their own infrastructure. In practice, what wins is not the loudest AI rhetoric but a combination of solid search, secure access, current data, and scenarios that actually save the team hours of work. For the market, this is yet another signal: corporate AI is increasingly turning from a pilot into an engineering product with clear costs and areas of responsibility.

