How a single system instruction turns an LLM into a reliable tool: tests on Qwen and DeepSeek
LLM hallucinations are not a death sentence. A single system prompt can transform a model from a 'confident liar' into a reliable working tool. Tests on Qwen…
AI-processed from Habr AI; edited by Hamidun News
Large language models lie beautifully. Not because they're evil — simply because they're trained to continue text, not to tell the truth. Where a model lacks the necessary data, it generates something plausible and delivers it with expert certainty.
For applied tasks — corporate assistants, analytical tools, decision-support systems — such behavior is unacceptable. An error delivered with confidence is worse than an error with a caveat. The author of an article on Habr proposed a simple but effective metaphor: LLMs need an exoskeleton.
Not fine-tuning, not an RLHF round, not expensive training — a single system instruction that sets the model strict behavioral rules in situations of uncertainty. Tests were conducted on two of the most popular open-source models with strong Russian language support: Qwen (series from Alibaba) and DeepSeek — both are actively used in Russian products precisely because of their accessibility and quality. The essence of the "exoskeleton" is to prevent the model from being overconfident where it is uncertain.
The system instruction prescribes several key rules. First: explicitly acknowledge uncertainty — don't pass over it in silence, but directly say "I don't know" or "I don't have sufficient data". Second: clarify the request if it's ambiguous, instead of choosing one interpretation and answering that.
Third: clearly distinguish between facts the model is confident in and those it merely assumes. Fourth: refuse to answer in areas where the risk of error is high and there's no way to verify the information within the model itself. In theory, this sounds trivial.
In practice — it works. After adding the instruction, Qwen and DeepSeek began far more frequently acknowledging the boundaries of their knowledge: in test scenarios with intentionally insufficient or contradictory context, models stopped "making things up" and started requesting clarifications or explicitly marking uncertainty. The level of confident hallucinations in these scenarios dropped noticeably.
Why isn't this obvious? Because by default, LLMs are trained to give a complete, confident answer — precisely for this they received high marks in RLHF. A human evaluator instinctively prefers elaborate, confident text to a short "I don't know".
The model learned this preference. As a result, it has built-in behavior directly opposite to what's needed in real production, where the cost of an error is measured in reputation or money. A system instruction is a way to rewrite this behavior without changing the model's weights.
Essentially, we impose epistemological humility on the model from the outside. Hence the exoskeleton metaphor: the model itself doesn't change internally, but around it emerges a rigid behavioral structure that directs reactions in the right direction. An important nuance: the instruction must be concrete, not declarative.
"Be accurate and honest" doesn't work — the model already considers itself accurate and honest. What works are specific situations: if the request lacks sufficient context — ask a clarifying question; if you're not confident in a fact — explicitly indicate this and explain why; if the question falls outside your data — say so directly. Each rule describes a specific trigger and a specific action in response to it.
Developers often fear that restrictions will reduce the model's usefulness. The tests showed no such effect. In scenarios with sufficient context, models performed just as well as without the instruction.
The restriction only kicked in where data truly was lacking — exactly those cases where the model used to hallucinate. For teams building internal tools on LLMs — corporate knowledge bases, analytical assistants, document management systems — this is a practically applicable result right now. No need to wait for the next model version, allocate a budget for fine-tuning, or change the architecture.
It's enough to write the system prompt correctly — and the model starts behaving the way business needs, not the way it was trained to please random evaluators.
Want to stop reading about AI and start using it?
AI News is a curated feed of AI/tech news. Hamidun Academy teaches you to use AI systematically in your work.