Not a Model, but a System: How Svoi Built a 7-Layer Fintech Bot Architecture
AI-processed from Habr AI; edited by Hamidun News
When a voice assistant tells a bank customer they have a million rubles in their account when they only have a hundred, that's not a neural network problem. It's a system-wide failure.
Architecture, Not Magic
A hybrid architecture for a voice bot in fintech isn't "just one good LLM and you're done." It's seven layers, each with its own job:
- ASR (automatic speech recognition): what the user said
- NLU (natural language understanding): what they wanted to do
- Routing (request dispatch): where to send the request
- API (data retrieval): facts about the client and their accounts
- Knowledge (knowledge base): current information about products
- Compliance (rule checking): whether the answer is allowed
- Voice (speech synthesis): how to make the answer sound natural
Orchestration sits on top: the LLM decides how to connect all these pieces. When the system works, nobody notices. When one link breaks, everything breaks.
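The seven layers listed above can be sketched as a chain of small functions that pass one shared state object along. This is a minimal illustration, not Svoi's implementation: the `Turn` class, the function names, the keyword-based intent detection, and the placeholder facts are all assumptions made for the example, and the fixed layer order stands in for the LLM orchestrator the article describes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Turn:
    """State carried through the layers for one customer utterance (hypothetical)."""
    audio: bytes
    text: str = ""
    intent: str = ""
    route: str = ""
    facts: Dict[str, str] = field(default_factory=dict)
    allowed: bool = True
    answer: str = ""
    trace: List[str] = field(default_factory=list)


def asr(turn: Turn) -> Turn:
    # Stand-in for speech recognition: treat the audio bytes as UTF-8 text.
    turn.text = turn.audio.decode("utf-8")
    return turn


def nlu(turn: Turn) -> Turn:
    # Toy intent detection by keyword.
    turn.intent = "balance" if "balance" in turn.text.lower() else "product_info"
    return turn


def routing(turn: Turn) -> Turn:
    # Account questions go to the API, everything else to the knowledge base.
    turn.route = "api" if turn.intent == "balance" else "knowledge"
    return turn


def api(turn: Turn) -> Turn:
    if turn.route == "api":
        turn.facts["balance"] = "100 RUB"  # placeholder, not real data
    return turn


def knowledge(turn: Turn) -> Turn:
    if turn.route == "knowledge":
        turn.facts["product"] = "current terms"  # placeholder, not real data
    return turn


def compliance(turn: Turn) -> Turn:
    # Toy rule: only answer when some layer actually supplied a fact.
    turn.allowed = bool(turn.facts)
    return turn


def voice(turn: Turn) -> Turn:
    # Stand-in for speech synthesis: produce the text that would be spoken.
    if turn.allowed:
        turn.answer = "; ".join(f"{k}: {v}" for k, v in turn.facts.items())
    else:
        turn.answer = "Transferring you to an operator."
    return turn


LAYERS: List[Callable[[Turn], Turn]] = [asr, nlu, routing, api, knowledge, compliance, voice]


def orchestrate(audio: bytes) -> Turn:
    """Run the layers in a fixed order and record which ones ran."""
    turn = Turn(audio=audio)
    for layer in LAYERS:
        turn = layer(turn)
        turn.trace.append(layer.__name__)
    return turn
```

Because every layer writes into the same `Turn`, the `trace` field shows which layers ran for a given answer, and each field shows what that layer contributed. That makes it easier to find which link is responsible when the final answer is wrong.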
Why a Weak Link Is Stronger Than a Strong Model
Fintech's paradox in one sentence: if the knowledge base is three days out of date, no GPT-5 will save you. If routing can't hand off to an operator when it doesn't understand the request, the best NLU is useless. If the API returns data with a three-hour delay, the assistant will keep giving wrong information, day after day. Svoi.ru has seen this in production. A customer may complain for a long time that the bot is "stupid" when the real problem is that transaction histories aren't updating in real time. Even a very good LLM will faithfully report whatever it is given as input.
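One way to keep stale data from reaching the customer is to check how old each piece of data is before it is used. The sketch below is an assumption-laden illustration: the `MAX_AGE` thresholds, the function names, and the `ESCALATE_TO_OPERATOR` marker are hypothetical and not taken from the article.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical freshness budgets per data source; real values depend on the product.
MAX_AGE = {
    "knowledge": timedelta(days=1),
    "api": timedelta(minutes=5),
}


def is_fresh(source: str, updated_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True if data from `source` is within its freshness budget."""
    now = now or datetime.now(timezone.utc)
    return now - updated_at <= MAX_AGE[source]


def answer_or_escalate(
    source: str, value: str, updated_at: datetime, now: Optional[datetime] = None
) -> str:
    """Pass the value through only if it is fresh; otherwise signal a handoff."""
    if is_fresh(source, updated_at, now):
        return value
    return "ESCALATE_TO_OPERATOR"  # hypothetical marker for the routing layer
```

With these thresholds, a knowledge base entry updated three days ago or API data that is three hours old would be rejected instead of being read out to the customer.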
How It All Connects
Imagine this scenario: a customer calls the bank and asks about cashback in rubles. ASR transcribes the question correctly. NLU understands that product information is needed. Routing sends the request to the knowledge base. But the knowledge base was last updated a week ago, before the latest change in terms. Voice synthesizes the answer anyway, and the customer hears outdated terms. The LLM isn't to blame: it faithfully recounted what it was given. That's why, in such systems, developers spend 80% of their time not on improving the model, but on:
- keeping the knowledge base up to date and consistent
- API reliability and speed
- quality and relevance of routing logic
- graceful degradation when a component fails
- testing each layer separately
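Two items from this list, graceful degradation and testing each layer separately, can be illustrated with a small wrapper. This is a sketch under stated assumptions: `with_fallback`, the two toy knowledge layers, and the handoff phrase are hypothetical names invented for the example.

```python
from typing import Callable

OPERATOR_HANDOFF = "I can't check that right now. Let me connect you to an operator."


def with_fallback(
    layer: Callable[[str], str], fallback: str = OPERATOR_HANDOFF
) -> Callable[[str], str]:
    """Wrap a layer so that any failure turns into a safe fallback answer."""
    def guarded(request: str) -> str:
        try:
            return layer(request)
        except Exception:
            # Degrade to an honest handoff instead of guessing or crashing.
            return fallback
    return guarded


def healthy_knowledge(request: str) -> str:
    # Toy knowledge layer that always succeeds.
    return f"Terms for: {request}"


def broken_knowledge(request: str) -> str:
    # Toy knowledge layer that simulates an outage.
    raise TimeoutError("knowledge base unavailable")
```

Each layer is a plain function, so it can be exercised in isolation: `with_fallback(healthy_knowledge)("cashback")` returns the layer's own answer, while `with_fallback(broken_knowledge)("cashback")` returns the handoff phrase.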
What This Means
The era when you could pick the coolest model and wait for magic is over. In banks, insurance, brokerages — everywhere an error costs money — the winner won't be the company with the best LLM, but the company with the best engineering. And that's good: it means the door for newcomers stays open if you understand how systems work as a whole.