
Fintech group "Svoi" explains how to make LLM agents cheaper and more accurate in code

The fintech group "Svoi" released a guide on working with LLM agents in development. The key insight: neural networks cannot be used as "improved search"…

AI-processed from Habr AI; edited by Hamidun News
Source: Habr AI. Collage: Hamidun News.

LLM assistants have already become a working tool for developers, but the return on them depends less on the model itself than on how the context around it is structured. The fintech group "Svoi" points to a simple problem: many engineers still treat Cursor, Windsurf, and similar systems as if they were just a more convenient search engine. As a result, the agent receives tasks that are too vague, loses focus, burns extra tokens, and produces code that looks plausible but integrates poorly into the project.

In the published guide, the developers propose viewing an LLM not as a universal advisor but as an isolated computing kernel. Such a system has no "project understanding" by default: it relies only on the context explicitly passed to it and on the rules by which that context is assembled. If the architecture of the prompts and the file environment is not thought through, the model starts mixing levels of abstraction, confusing dependencies, and making mistakes even where the task seems routine.

The authors draw on experience implementing AI tools in fintech projects, where the cost of inaccuracy is especially high. For teams working with business-critical code, the problem lies not only in answer quality but also in the predictability of agent behavior. What matters is that the agent does not just occasionally produce successful fragments but consistently performs well-understood operations: analyzing a code section, suggesting safe edits, staying within its assigned role, and not wasting budget on pointless iterations.

This is why the focus shifts from the magic of the model to the engineering discipline around it. The key thesis of the guide is that LLM effectiveness is directly tied to context architecture: the task needs to be broken down, the instructions limited, and the data sources structured so that the agent sees only what is necessary at each moment.
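One way to picture "the agent sees only what is necessary" is a small filter that ranks repository files by relevance to the task and stops once a token budget is used up. The sketch below is only an illustration and is not taken from the guide: the names SourceFile, estimate_tokens, and select_context are hypothetical, the keyword count is a deliberately naive relevance score, and the four-characters-per-token estimate is a rough assumption.

```python
from dataclasses import dataclass


@dataclass
class SourceFile:
    path: str
    text: str


def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def select_context(files, task_keywords, token_budget):
    """Return the most relevant files that still fit into the token budget."""
    keywords = [k.lower() for k in task_keywords]
    scored = []
    for f in files:
        haystack = (f.path + "\n" + f.text).lower()
        score = sum(haystack.count(k) for k in keywords)
        if score > 0:  # files that never mention the task are treated as noise
            scored.append((score, f))
    # Highest score first; the path breaks ties so the result is deterministic.
    scored.sort(key=lambda item: (-item[0], item[1].path))

    selected, used = [], 0
    for _, f in scored:
        cost = estimate_tokens(f.text)
        if used + cost <= token_budget:
            selected.append(f)
            used += cost
    return selected


# Hypothetical three-file repository; only two files relate to the task.
repo = [
    SourceFile("payments/refund.py", "def refund(payment): ...  # refund logic"),
    SourceFile("payments/models.py", "class Payment: ...  # used by refund"),
    SourceFile("ui/banner.py", "def render_banner(): ..."),
]
context = select_context(repo, ["refund"], token_budget=50)
print([f.path for f in context])
```

In a real setup the relevance score would more likely come from imports, call graphs, or embeddings, but the shape stays the same: score, sort, and cut at a budget before anything reaches the model.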

The less noise in the environment, the higher the code accuracy and the lower the cost of repeated requests. This matters most where the agent has access to a large repository: without context filtering, it starts to "spread" across the project and loses the ability to solve local tasks confidently.

From an economic perspective, the idea is well grounded. The main expenses for AI tools come not only from the price of the model itself but also from a poorly organized workflow: long prompts, unnecessary files in context, repeated attempts to fix failed results, and constant returns to fragments that were already discussed. When an agent is assigned a clear role, its area of responsibility is limited, and results are checked against clear criteria, the team saves not only tokens but also developer time, which is usually spent on manual review and task reformulation.

A separate strength of the material is that it moves the conversation about AI assistants from general promises to practice. Instead of the idea "give the model the whole project and let it figure it out," it proposes a more mature scenario: build clear boundaries, roles, sequences of actions, and verification mechanisms around the agent. Essentially, this is about turning a neural network into a managed development tool rather than an improvising co-author. For companies already paying for AI assistants, this is an important shift: costs are reduced not by abandoning models but by organizing their work more precisely.
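Boundaries and verification can be as simple as a check that runs before an agent's edits are accepted. The following is a minimal sketch under stated assumptions, not the guide's mechanism: AgentRole, check_edits, the glob patterns, and the file limit are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class AgentRole:
    name: str
    allowed_globs: list          # paths this role is allowed to modify
    max_changed_files: int = 5   # hypothetical cap on the size of one edit set


def check_edits(role, changed_paths):
    """Return a list of violations; an empty list means the edit set passes."""
    violations = []
    if len(changed_paths) > role.max_changed_files:
        violations.append(
            f"too many files changed: {len(changed_paths)} > {role.max_changed_files}"
        )
    for path in changed_paths:
        # fnmatchcase compares case-sensitively on every platform.
        if not any(fnmatchcase(path, pattern) for pattern in role.allowed_globs):
            violations.append(f"outside role scope: {path}")
    return violations


# Hypothetical role that may only touch the payments module and its tests.
role = AgentRole("refund-fixer", ["payments/*.py", "tests/payments/*.py"])
print(check_edits(role, ["payments/refund.py", "config/secrets.yaml"]))
```

A check like this is cheap to run on every agent response, so an out-of-scope edit is rejected before a developer spends time reviewing it.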

Another practical conclusion follows from this logic: the better the team describes the task, the artifacts, and the readiness criteria, the less likely the LLM is to fill gaps with guesses. For engineering processes this is especially important, because a model easily produces convincing but invalid code. Mature work with agents therefore gradually comes to resemble pipeline design: first the input data is defined, then the constraints, then the execution steps, and only after that is the model given freedom to generate.
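The pipeline ordering described above (inputs, constraints, steps, and only then generation) can be sketched as a task specification that refuses to produce a prompt while any part is missing. This is an illustrative assumption about how such a spec might look, not the format used in the guide; TaskSpec, build_prompt, and the section titles are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskSpec:
    inputs: list         # artifacts the agent may rely on
    constraints: list    # what it must not do
    steps: list          # ordered execution steps
    done_criteria: list  # how the result will be checked


def build_prompt(goal, spec):
    """Assemble a prompt in pipeline order; fail early if a section is empty."""
    sections = [
        ("Input data", spec.inputs),
        ("Constraints", spec.constraints),
        ("Execution steps", spec.steps),
        ("Readiness criteria", spec.done_criteria),
    ]
    missing = [title for title, items in sections if not items]
    if missing:
        # An empty section is exactly the gap the model would fill with guesses.
        raise ValueError("incomplete task spec, missing: " + ", ".join(missing))

    lines = []
    for title, items in sections:
        lines.append(f"{title}:")
        lines.extend(f"- {item}" for item in items)
    lines.append(f"Task: {goal}")  # the generation request comes last
    return "\n".join(lines)


# Hypothetical task description for a small, local change.
spec = TaskSpec(
    inputs=["payments/refund.py"],
    constraints=["do not change public function signatures"],
    steps=["locate the rounding bug", "propose a minimal patch"],
    done_criteria=["existing unit tests pass"],
)
prompt = build_prompt("Fix rounding in refund()", spec)
print(prompt)
```

Raising an error on an empty section moves the failure from a plausible-looking but wrong answer to an explicit request for the team to finish describing the task.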

This is a good signal for the development market as a whole. As AI assistants become the norm, competitive advantage will be determined not only by the choice of model but also by how well the team designs the context, constraints, and scenarios of interaction with the agent. In other words, the next stage of maturity is not "use an LLM" but "build a working operational environment for it."

This is precisely what distinguishes random code generation from predictable engineering practice.

