Habr AI Explains Why LLMs Don't Calculate, Don't Learn in Dialogue, and Depend on Tools

Q: What is the source?

Originally published on Habr AI. Hamidun News processes and adapts the material with AI.

Q: When was it published?

Apr 28, 2026. Reading time: 3 min.

Habr AI debunks two major LLM myths: they don't learn directly in chat and can't do 'everything on their own.' A language model excels at text, but for…

Hamidun News Editorial

AI monitoring · Habr AI

Apr 28, 2026· 3 min

AI-processed from Habr AI; edited by Hamidun News

Habr AI Explains Why LLMs Don't Calculate, Don't Learn in Dialogue, and Depend on Tools — Source: Habr AI. Collage: Hamidun News.

◐ Listen to article

An article on Habr AI debunks the popular myth that a modern chatbot is already a universal intelligence by itself. The author's main thesis is simple: a base LLM by its nature can only work with text — accept a text request and generate a text response. Everything else that a user perceives as the "magical abilities" of the model is usually provided by external tools, integrations, and orchestration. That's why the same interfaces can both draw pictures, search the internet, and calculate numbers, even though the language model itself doesn't become an artist, a search engine, or a calculator from this.

The first misconception is linked to the feeling that an LLM "can do everything." If you ask it to create an image, it forms a request for a separate generation model. If you talk to it by voice, speech recognition and voice synthesis are involved in the pipeline.

If you need precise calculation, reliable results usually appear only after calling a code interpreter or another computational tool. Without such extensions, an LLM relies on probabilistic reproduction of patterns from training: it can correctly solve a simple example, but on long numbers, formulas, and tasks requiring high precision it easily makes mistakes. From this follows an important practical boundary: the model's strength is not mathematics as such, but textual description of the task and selection of the appropriate tool.

The second myth is that the model learns right during the conversation. The author reminds us that inference and training are two different processes. When a user writes a request, the model sequentially generates tokens based on already fixed weights, and the weights themselves don't change at that moment.

This means a specific LLM in a specific session doesn't "learn a lesson" and doesn't become smarter from a user's remark. Yes, providers can later use anonymized dialogues to train future versions, but that's already a separate fine-tuning cycle, not magical self-update in the chat. From this also follows another conclusion: user memory between dialogues is usually not model training, but saved context that is then mixed back into the request.

The article then briefly explains what a public LLM consists of. At its core is a transformer that sees all available context at once and builds a response as a sequence of probable tokens, maintaining overall text coherence through learned patterns. On top of that is RLHF — tuning for assistant format, politeness, instruction-following, and safety restrictions.

But RLHF doesn't turn the model into a logic machine and doesn't fix fundamental weaknesses. Therefore, language models are good at text analysis, summarization, style change, step-by-step instructions, working with formats like JSON, and tool selection. They are weak at precise computation, processing large tables, holding vast amounts of data in context, and knowing the world's current state after the training date.

To this add probabilistic nature of the answer, sensitivity to prompt formulation, and risk of hallucinations.

To make an LLM useful in production, an additional layer is built around it. For static knowledge, RAG is used: documents are split into fragments, semantically close pieces are found by request, and the model receives only relevant context. For dynamic data and actions, function calling is applied: the LLM decides when to call an API, database, calculator, or simulation, and the orchestrator validates calls, adds tool responses to the history, and manages the entire cycle. The same orchestrator handles dialogue memory, system prompts, output format validation, and launching subagents.

On this foundation emerge more ambitious concepts — AI agents, digital employees, copilots, and digital twins. In essence, this is not separate magic, but combinations of LLMs, knowledge bases, APIs, automation, and classical computational engines. This means discussing "artificial intelligence" without distinguishing technologies is no longer sufficient.

If a business needs precise calculation, strict automation, or forecasting on structured data, an LLM alone is not enough. If work with emails, documents, instructions, knowledge search, and a dialogue interface to a complex system is needed, an LLM truly gives a strong boost. The sober perspective from the article is useful precisely because it removes excess expectations: there's no need to ascribe superhuman abilities to a language model, but it's also not worth underestimating it as an interface and coordinator of other tools.

Hamidun News

AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.

Telegram channel RSS hamidun.com

Want to stop reading about AI and start using it?

AI News is a curated feed of AI/tech news. Hamidun Academy teaches you to use AI systematically in your work.

🎓 Academy — 7 days free Free consultation