Not a Model, but a System: How Svoi Built a 7-Layer Fintech Bot Architecture

Q: What is the source?

Originally published on Habr AI. Hamidun News processes and adapts the material with AI.

Q: When was it published?

2026-05-21. Reading time: 2 min.


Hamidun News Editorial


When a voice assistant tells a bank customer they have a million rubles in their account when the real balance is a hundred, that is not a neural network problem. It is a system-wide failure.

Architecture, Not Magic

A hybrid architecture for a voice bot in fintech isn't "just one good LLM and you're done." It's seven layers, each with its own job:

ASR (automatic speech recognition): what the user said
NLU (natural language understanding): what they wanted to do
Routing (request dispatch): where to send the request
API (data retrieval): facts about the client and their accounts
Knowledge (knowledge base): current information about products
Compliance (rule checking): whether the answer is allowed
Voice (speech synthesis): how to deliver the answer naturally

On top of these sits orchestration: the LLM decides how to connect the pieces. When the system works, nobody notices. When one link breaks, everything breaks.
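To make the layering concrete, here is a minimal sketch of the seven layers as a chain of functions sharing one state object. Everything in it is an assumption for illustration: the `Turn` class, the stub layers, and the hard-coded balance are hypothetical, and the fixed sequential order is a simplification of the LLM-driven orchestration the article describes.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    """State carried through the layers for one customer utterance."""
    audio: bytes
    text: str = ""
    intent: str = ""
    route: str = ""
    facts: dict = field(default_factory=dict)
    allowed: bool = False
    answer: str = ""
    trace: list = field(default_factory=list)


# Every layer below is a hypothetical stub standing in for a real service.
def asr(t: Turn) -> None:
    t.text = t.audio.decode("utf-8")  # pretend the audio is already text


def nlu(t: Turn) -> None:
    t.intent = "balance" if "balance" in t.text.lower() else "unknown"


def routing(t: Turn) -> None:
    t.route = "api" if t.intent == "balance" else "operator"


def api(t: Turn) -> None:
    if t.route == "api":
        t.facts["balance_rub"] = 100  # stand-in for a real banking API call


def knowledge(t: Turn) -> None:
    if t.route == "knowledge":
        t.facts["product_terms"] = "placeholder terms"


def compliance(t: Turn) -> None:
    # Allow an automated answer only when there are facts to ground it.
    t.allowed = bool(t.facts)


def voice(t: Turn) -> None:
    if t.allowed and "balance_rub" in t.facts:
        t.answer = f"Your balance is {t.facts['balance_rub']} rubles."
    else:
        t.answer = "Transferring you to an operator."


LAYERS = [asr, nlu, routing, api, knowledge, compliance, voice]


def run_pipeline(t: Turn) -> Turn:
    """Run the layers in order and record which ones executed."""
    for layer in LAYERS:
        layer(t)
        t.trace.append(layer.__name__)
    return t


turn = run_pipeline(Turn(audio=b"What is my balance?"))
print(turn.trace)
print(turn.answer)
```

The `trace` list is the useful part of the design: when an answer is wrong, it shows which layers ran, so the fault can be located in a specific layer instead of being blamed on the model.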

Why a Weak Link Is Stronger Than a Strong Model

Fintech's paradox in one sentence: if the knowledge base is three days out of date, no GPT-5 will save you. If routing cannot hand the call to an operator when the bot does not understand the request, the best NLU is useless. If the API returns data with a three-hour delay, the assistant will keep giving customers wrong information, day after day.

Svoi.ru has seen this in production. A customer may complain for a long time that the bot is "stupid" when the real problem is that transaction histories are not updating in real time. Even a very good LLM can only faithfully report what it is given as input.
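One way to catch stale inputs before they reach the customer is to attach a timestamp to every fact and refuse to answer from data older than a set budget. The sketch below is an illustration under stated assumptions, not Svoi's implementation: the `Sourced` type, the `MAX_AGE` budgets (five minutes for API data, one day for the knowledge base), and the handoff message are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Sourced:
    """A value together with the time its source was last updated."""
    value: str
    updated_at: datetime


# Hypothetical freshness budgets; real limits depend on the product.
MAX_AGE = {
    "api": timedelta(minutes=5),
    "knowledge": timedelta(days=1),
}

HANDOFF = "I can't confirm that right now. Let me connect you to an operator."


def respond(item: Sourced, source: str, now: datetime) -> str:
    """Return the value if it is fresh enough, otherwise hand off."""
    age = now - item.updated_at
    if age > MAX_AGE[source]:
        return HANDOFF
    return item.value


now = datetime(2026, 5, 21, 12, 0, tzinfo=timezone.utc)
balance = Sourced("Your balance is 100 rubles.", now - timedelta(hours=3))
terms = Sourced("Current cashback terms.", now - timedelta(hours=2))

print(respond(balance, "api", now))      # three hours old: over the API budget
print(respond(terms, "knowledge", now))  # two hours old: within the budget
```

Passing `now` explicitly keeps the check deterministic, so the same function can be exercised in tests with fixed timestamps.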

How It All Connects

Imagine this scenario: a customer calls the bank and asks about cashback in rubles. ASR hears the request correctly. NLU understands that product information is needed. Routing sends the request to the knowledge base. But the knowledge base was last updated a week ago, before the latest change in terms. Voice synthesizes the answer, and the customer hears outdated conditions. The LLM isn't to blame: it faithfully relayed what it was given. That's why, in such systems, developers spend 80% of their time not on improving the model but on:

keeping the knowledge base up to date and consistent
API reliability and speed
the quality and relevance of routing logic
graceful degradation when a component fails
testing each layer separately
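The last two items in the list can be sketched briefly. The example below assumes a hypothetical `with_fallback` wrapper that degrades to an operator handoff when a component raises, and a hypothetical `route` function with an invented confidence threshold of 0.6, checked in isolation from every other layer. None of these names come from the original article.

```python
from typing import Callable

OPERATOR_MESSAGE = "Let me connect you to an operator."


def with_fallback(
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    log: list,
) -> Callable[[str], str]:
    """Wrap a component so that a failure degrades instead of crashing."""
    def wrapped(query: str) -> str:
        try:
            return primary(query)
        except Exception as exc:  # broad on purpose: any failure degrades
            log.append(f"{primary.__name__} failed: {exc}")
            return fallback(query)
    return wrapped


def knowledge_lookup(query: str) -> str:
    # Simulated outage of the knowledge layer.
    raise TimeoutError("knowledge base unavailable")


def transfer_to_operator(query: str) -> str:
    return OPERATOR_MESSAGE


def route(intent: str, confidence: float, threshold: float = 0.6) -> str:
    """Send low-confidence or unknown intents to a human."""
    if confidence < threshold or intent == "unknown":
        return "operator"
    return "knowledge" if intent == "product_info" else "api"


log: list = []
safe_lookup = with_fallback(knowledge_lookup, transfer_to_operator, log)
print(safe_lookup("cashback terms"))
print(log)

# Layer-level checks: routing is tested without ASR, NLU, or the LLM.
assert route("product_info", 0.9) == "knowledge"
assert route("product_info", 0.3) == "operator"
assert route("unknown", 0.95) == "operator"
```

Because `route` is a plain function of its inputs, its checks run without the other layers, which is what makes testing each layer separately practical.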

What This Means

The era when you could pick the most powerful model and wait for magic is over. In banks, insurance, and brokerages, everywhere an error costs money, the winner won't be the company with the best LLM but the company with the best engineering. And that's good: it means the door stays open for newcomers who understand how systems work as a whole.
