How to Control an LLM in a Role-Playing Game: The Beyond The Verge Architecture

Q: What is the source?

Originally published on Habr AI. Hamidun News processes and adapts the material with AI.

Q: When was it published?

May 29, 2026. Reading time: 3 min.

Beyond The Verge, a fully Russian-language text RPG built on an LLM, ran into a classic problem: models forget context, violate game rules, and conjure items out of thin air. After 30 turns, the game degenerates into incoherent chat. Rather than trying to "tame" the model, the developers took control of the mechanics away from it.

Why an LLM Isn't Suited to Mechanics

An LLM is generative by nature, while RPG mechanics require determinism. The model remembers only what fits in its context window, cannot distinguish what it imagined from the actual game state, and can spontaneously change the plot or break the game's logic. If it is allowed to manage inventory, it will readily lose a sword, forget constraints, or add an item the player never picked up.

Architecture: LLM as Narrator Only

Beyond The Verge divided responsibilities. All game mechanics are deterministic logic on the backend:

Inventory: a row in PostgreSQL with item IDs, weight, and properties
Map: a graph of vertices (locations) and edges (transitions between them)
Character state: a vector in pgvector for fast context search
Combat system: damage, defense, and critical-hit formulas, all calculated by the backend
Quests: finite state machines with fixed states
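
As an illustration of two items from this list, the map graph and the quest state machine could look roughly like the sketch below. The location names, quest states, and function names are hypothetical and are not taken from the project; the point is only that both structures are plain lookups with no model involved.

```python
# Hypothetical map: locations are vertices, allowed transitions are edges.
WORLD_MAP: dict[str, set[str]] = {
    "village": {"forest"},
    "forest": {"village", "cave"},
    "cave": {"forest"},
}


def can_move(src: str, dst: str) -> bool:
    """A move is legal only if an edge src -> dst exists in the graph."""
    return dst in WORLD_MAP.get(src, set())


# Hypothetical quest: a finite state machine with a fixed transition table.
QUEST_TRANSITIONS: dict[tuple[str, str], str] = {
    ("not_started", "accept"): "active",
    ("active", "kill_goblin"): "ready_to_turn_in",
    ("ready_to_turn_in", "report"): "done",
}


def advance_quest(state: str, event: str) -> str:
    """Return the next quest state; events that do not apply change nothing."""
    return QUEST_TRANSITIONS.get((state, event), state)
```

Because every legal transition is listed explicitly, anything the model "invents" (a shortcut between locations, a skipped quest step) simply has no entry and cannot happen.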

The LLM receives a snapshot of the game state in text form and generates only the description: "You entered a dark forest. You hear birds singing. In inventory: dagger, mana potion (30%). Ahead is a goblin, weak."
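
A minimal sketch of how such a snapshot might be rendered into text for the model is shown here. The `Snapshot` and `Item` classes, the field names, and the wording of the instruction line are all assumptions made for illustration; the article does not describe the actual prompt format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Item:
    name: str
    charge_pct: Optional[int] = None  # e.g. how much of a potion is left


@dataclass
class Snapshot:
    location: str
    inventory: list[Item] = field(default_factory=list)
    visible: list[str] = field(default_factory=list)


def render_snapshot(s: Snapshot) -> str:
    """Turn authoritative backend state into a compact text block for the LLM."""
    items = ", ".join(
        i.name if i.charge_pct is None else f"{i.name} ({i.charge_pct}%)"
        for i in s.inventory
    ) or "empty"
    seen = ", ".join(s.visible) or "nothing notable"
    return (
        f"Location: {s.location}\n"
        f"Inventory: {items}\n"
        f"Visible: {seen}\n"
        "Describe the scene. Do not add, remove, or change items or creatures."
    )
```

The model only ever sees facts the backend has already decided, so the narration can be vivid without being authoritative.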

Each player action is parsed and then validated by backend logic (can the character do this?). The backend calculates the result, and only after that does the LLM describe the consequences.
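
The parse, validate, resolve, narrate sequence can be sketched as follows. This is a simplified illustration under stated assumptions: the two-word command parser, the `GameState` fields, and the `narrate` callback (which stands in for the real LLM call) are hypothetical and not the project's code.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class GameState:
    location: str
    inventory: set[str] = field(default_factory=set)
    exits: dict[str, set[str]] = field(default_factory=dict)  # map graph


def parse(command: str) -> tuple[str, str]:
    """Split 'go forest' into ('go', 'forest'); a stand-in for a real parser."""
    verb, _, arg = command.strip().lower().partition(" ")
    return verb, arg.strip()


def take_turn(state: GameState, command: str,
              narrate: Callable[[str], str]) -> str:
    """Validate and apply the action in code, then hand the outcome to the narrator."""
    verb, arg = parse(command)
    if verb == "go":
        if arg in state.exits.get(state.location, set()):
            state.location = arg  # state changes only here, never in the LLM
            event = f"The player moved to {arg}."
        else:
            event = f"The player tried to go to {arg}, but no path leads there."
    elif verb == "use":
        if arg in state.inventory:
            state.inventory.remove(arg)
            event = f"The player used {arg}."
        else:
            event = f"The player tried to use {arg}, but does not have it."
    else:
        event = "The player did something the game does not recognize."
    return narrate(event)  # the narrator only rephrases an already-decided outcome
```

In a test, `narrate` can be a plain function such as `lambda event: event`; in production it would wrap the model call, which by then has no way to alter the result.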

FastAPI + PostgreSQL + pgvector

The stack is simple but effective: FastAPI processes the player's turn, PostgreSQL stores state (inventory, NPCs, quests), pgvector finds relevant context for the LLM (character memories, location atmosphere), and Flutter Web serves as the interface.

When the player moves, the backend updates the position in the map graph, retrieves the most relevant descriptions from pgvector, and collects the visible objects. The LLM receives this compact context and generates text in under 1 second. Because the context is rebuilt from stored state on every turn, the model has nothing to remember and nothing to forget.
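
The retrieval step can be pictured with the sketch below. The SQL string uses pgvector's cosine-distance operator `<=>`, but the table and column names are hypothetical; the Python functions are an in-memory stand-in that performs the same ranking so the idea can be run without a database.

```python
import math

# Hypothetical query shape; `memories`, `session_id`, and `embedding` are
# assumed names. `<=>` is pgvector's cosine-distance operator.
NEAREST_SQL = """
SELECT text FROM memories
WHERE session_id = %s
ORDER BY embedding <=> %s::vector
LIMIT %s
"""


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity, the quantity the SQL above orders by."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def top_k(query: list[float], rows: list[tuple[str, list[float]]], k: int) -> list[str]:
    """Return the k texts whose embeddings are closest to the query vector."""
    ranked = sorted(rows, key=lambda row: cosine_distance(query, row[1]))
    return [text for text, _ in ranked[:k]]
```

Only the few closest snippets go into the prompt, which keeps the context small regardless of how long the session has been running.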

Scalability through Versioning

With 1000 simultaneous game sessions, state updates can conflict: if two moves arrive at the same time, which one wins? The solution is optimistic locking with state versioning. Each state carries a version number; if a conflict is detected during a move, the client resynchronizes and the move is replayed. A stale write is rejected instead of silently overwriting a newer state, and the system reportedly scales linearly.
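
Optimistic locking with a version number can be sketched in memory as follows. The `StateStore` class and its methods are hypothetical; in PostgreSQL the same check is typically a single `UPDATE ... WHERE session_id = ... AND version = ...` whose affected-row count tells the caller whether it lost the race. The lock here only stands in for the atomicity the database would provide.

```python
import threading
from dataclasses import dataclass
from typing import Callable


class VersionConflict(Exception):
    """Raised when a write is based on a stale version of the state."""


@dataclass
class Versioned:
    data: dict
    version: int = 0


class StateStore:
    def __init__(self) -> None:
        self._rows: dict[str, Versioned] = {}
        self._lock = threading.Lock()  # stand-in for the database's atomic UPDATE

    def create(self, sid: str, data: dict) -> None:
        self._rows[sid] = Versioned(dict(data), 0)

    def load(self, sid: str) -> Versioned:
        row = self._rows[sid]
        return Versioned(dict(row.data), row.version)  # copy, like a fresh read

    def save(self, sid: str, data: dict, expected_version: int) -> int:
        """Write only if nobody else has written since `expected_version`."""
        with self._lock:
            if self._rows[sid].version != expected_version:
                raise VersionConflict(sid)
            self._rows[sid] = Versioned(dict(data), expected_version + 1)
            return expected_version + 1


def apply_move(store: StateStore, sid: str,
               move: Callable[[dict], dict], retries: int = 3) -> int:
    """Resynchronize and replay the move when a conflict is detected."""
    for _ in range(retries):
        current = store.load(sid)
        try:
            return store.save(sid, move(dict(current.data)), current.version)
        except VersionConflict:
            continue  # someone else won; reload and try again
    raise VersionConflict(f"{sid}: gave up after {retries} attempts")
```

No lock is held while the move is being computed, which is what makes the scheme optimistic: conflicts cost a retry, but non-conflicting sessions never wait on each other.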

What This Means

An LLM is a text generation tool, not a control mechanism. In a reliable system, logic, state, and mechanics should live in code, not in the model; the LLM only translates events into vivid description. The same pattern applies to any AI system with deterministic state: browser agents, complex simulations, games. Separating what must be controlled from what can be generated is the path to reliability and scalability.

Hamidun News

AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.
