Claude Code without the magic: Habr breaks down the architecture, context noise, and engineering practices
AI-processed from Habr AI; edited by Hamidun News
Habr has published a translation of a substantial practical article on Claude Code, based on six months of intensive work with the agent. It is not a feature recap but an analysis of why an AI development tool starts to malfunction when it is given too much context, too many rules, and too much freedom at once.
Where Context Gets Lost
The article's central idea is simple: Claude Code breaks not because the model is "not smart enough," but because the engineering environment around it is poorly structured. The author describes the agent as a loop of gathering context, acting, and verifying the result. If even one stage of that loop is overloaded, quality drops sharply: a long CLAUDE.md creates noise, dozens of tools make tool selection harder, and the lack of fast verification turns every edit into a gamble. In that mode the developer ends up endlessly tweaking prompts, even though the problem lies deeper.
The cost of context gets separate treatment. On paper Claude Code has a large window, but a significant share of tokens is spent before work even begins: on system instructions, skill descriptions, MCP tool descriptions, LSP state, memory, and the project CLAUDE.md. The author gives a telling calculation: a single typical MCP server can take up thousands of tokens for tool schemas alone, and a few connected servers easily consume a noticeable part of the available window. Add long test output, grep results, and logs, and the useful dialog history starts getting pushed out. The window ends up shared among:
- System instructions and basic rules
- Descriptions of skills and MCP tools
- Project memory and CLAUDE.md contents
- Results of shell calls, tests, and searches
- Compressed, but no longer always accurate session history
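The arithmetic behind that list can be sketched in a few lines. Every number and name below is a hypothetical value chosen for illustration; none of them comes from the article or from Claude Code itself.

```python
# Hypothetical figures for illustration only; none come from the article.
CONTEXT_WINDOW = 200_000  # assumed window size, in tokens

fixed_overhead = {
    "system_instructions": 3_000,
    "skill_descriptions": 1_500,
    "mcp_tool_schemas": 5 * 4_000,  # five servers at ~4k tokens each (assumed)
    "memory_and_claude_md": 4_000,
}


def remaining_budget(window: int, overhead: dict[str, int]) -> int:
    """Tokens left for dialog history and tool output after fixed costs."""
    return window - sum(overhead.values())


def overhead_share(window: int, overhead: dict[str, int]) -> float:
    """Fraction of the window consumed before any work starts."""
    return sum(overhead.values()) / window


left = remaining_budget(CONTEXT_WINDOW, fixed_overhead)
share = overhead_share(CONTEXT_WINDOW, fixed_overhead)
print(f"fixed overhead: {share:.1%}, tokens left: {left}")
```

Under these assumed numbers the MCP schemas alone outweigh everything else in the fixed overhead, which is the pattern the author warns about.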
The practical conclusion follows: rarely used knowledge should not live in permanently loaded context. When switching tasks, clear the session more often; when continuing in the same direction, use managed compaction. Another tip from the article: before starting a new session, ask the agent to write a HANDOFF.md covering progress, dead ends, and verification status. This is cheaper and more reliable than hoping automatic compaction will preserve the architectural decisions that truly matter.
Skills, Hooks, and Agents
The second major section deals with separating roles. The author advises against lumping MCP, tools, skills, hooks, plugins, and subagents together. The logic: a tool provides a new capability, a skill defines a workflow, a hook enforces a mandatory automatic check, and a subagent moves a separate task into an isolated context. The distinction is useful because many teams try to fix every failure with a new prompt or another tool, when the problem may need solving at a different level entirely.
The article also spells out what a good skill looks like. It has a short, precise description, a clear usage trigger, documented inputs and outputs, and a stopping condition. Everything heavy (examples, runbooks, helper files) should be loaded on demand rather than sit in the main SKILL.md. For actions with side effects, the author advises explicitly disabling automatic invocation by the model. The idea is for the agent to see the index and the route first, and load details only when they are actually needed.
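The "index first, details on demand" idea can be modeled roughly as follows. This is a minimal sketch, not how Claude Code implements skills: the `Skill` class, `build_index`, and `invoke` are hypothetical names introduced only to show the shape of the pattern.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Skill:
    """Hypothetical model of a skill; not Claude Code's actual data structure."""
    name: str
    description: str                 # short text that is always in context
    load_details: Callable[[], str]  # heavy material, fetched only on use
    has_side_effects: bool = False   # if True, require explicit user invocation


def build_index(skills: list[Skill]) -> str:
    """Return the always-loaded index: one line per skill, no heavy content."""
    return "\n".join(f"{s.name}: {s.description}" for s in skills)


def invoke(skill: Skill, requested_by_user: bool) -> str:
    """Load details lazily; refuse model-initiated runs of side-effect skills."""
    if skill.has_side_effects and not requested_by_user:
        raise PermissionError(f"{skill.name} must be invoked explicitly")
    return skill.load_details()
```

Only the output of `build_index` stays resident in context in this model; the runbook behind `load_details` is read when a skill is actually invoked.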
Subagents are presented here not as a way to "speed everything up in parallel," but as a means of isolation. Codebase research, review, test runs, and other noisy operations are better delegated to child agents with limited permissions, a separate model, and a fixed response format. Hooks, by contrast, should catch deterministic things as early as possible: run a quick check after a file edit, block dangerous changes, inject technical context at session start. The earlier the system catches an error, the fewer tokens, the less time, and the fewer unnecessary edits are wasted.
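A deterministic "block dangerous changes" check might look like the sketch below. It assumes the hook receives a JSON event whose target path sits under `tool_input.file_path`, and that exit code 2 signals a blocking failure; both are assumptions to verify against the current hooks documentation. The protected paths are hypothetical.

```python
import json
from typing import TextIO

# Hypothetical deny-list of repo-relative paths; adjust to the project's boundaries.
PROTECTED_PREFIXES = ("migrations/", ".env")

BLOCK = 2  # assumed "blocking error" exit code; check the hooks documentation
ALLOW = 0


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for a pre-edit hook event.

    Assumes the path is at event["tool_input"]["file_path"] and is
    repo-relative; absolute paths would need normalizing first.
    """
    path = event.get("tool_input", {}).get("file_path", "")
    if path.startswith(PROTECTED_PREFIXES):
        return BLOCK, f"Blocked: {path} is protected"
    return ALLOW, ""


def run(stdin: TextIO, stderr: TextIO) -> int:
    """Read one JSON event, report any block reason, return the exit code."""
    code, message = decide(json.load(stdin))
    if message:
        print(message, file=stderr)
    return code
```

A real hook script would end with `sys.exit(run(sys.stdin, sys.stderr))`. Keeping the decision in a pure function means the rule can be unit-tested without running the agent.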
Cache, Verification, and Contract
One of the most interesting parts of the text explains how heavily Claude Code's architecture relies on prompt caching. The author writes that a stable prompt prefix not only saves money but also reduces friction with limits. Several non-obvious rules follow: do not insert dynamic data into the system prompt, do not reorder instructions, do not switch models in the middle of a long session unless necessary, and where possible defer full loading of rarely used tool schemas. Even Plan Mode, the article notes, is better implemented without swapping out the entire tool set, so the cache is not invalidated.
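Why these rules matter can be shown with a toy model in which the cache key is simply a hash of the prompt prefix. This is a simplification: real prompt caching matches token prefixes under provider-specific rules, and `prefix_key` is a hypothetical stand-in, not an actual API.

```python
import hashlib


def prefix_key(system_prompt: str, tool_schemas: list[str]) -> str:
    """Toy stand-in for a cache key: a hash of the stable prompt prefix."""
    h = hashlib.sha256()
    h.update(system_prompt.encode())
    for schema in tool_schemas:
        h.update(b"\x00")  # separator so adjacent schemas cannot merge
        h.update(schema.encode())
    return h.hexdigest()


TOOLS = ["read_file", "edit_file", "run_tests"]  # hypothetical tool names
RULES = "Follow the project contract."

# Anti-pattern: a timestamp in the system prompt changes the prefix every turn.
key_a = prefix_key(f"{RULES} Now: 10:00", TOOLS)
key_b = prefix_key(f"{RULES} Now: 10:01", TOOLS)

# Better: keep the prefix fixed and send dynamic data in a later message.
key_c = prefix_key(RULES, TOOLS)
key_d = prefix_key(RULES, TOOLS)

# Reordering the tool list also produces a different prefix.
key_e = prefix_key(RULES, list(reversed(TOOLS)))

print(key_a == key_b, key_c == key_d, key_c == key_e)
```

In this model only the unchanged prefix (`key_c` and `key_d`) would hit the cache; the timestamped and reordered variants each miss.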
The final emphasis is on verification and the role of CLAUDE.md. The author calls this file not a knowledge base but a contract between the project and the agent: how to build, how to test, which boundaries must not be crossed, what must be preserved during compaction, and which prohibitions always apply. CLAUDE.md does not need API references or long introductions; only rules that matter in every session belong there. A separate tip: after each recurring error, ask the agent to update the contract so the same mistakes do not return.
"If you cannot explain how to tell that
Claude did it right, the task probably isn't suitable for fully automatic execution."
What This Means
This material matters beyond Claude Code users. In essence it is a guide to maturing any AI coding agent: less magic, more isolation, explicit verification, and control over context. The further teams move from a "code chatbot" toward a semi-autonomous engineer, the more valuable these practices become compared with yet another collection of prompts.