Harness around an LLM delivers multi-fold gains: what changed after a year with Claude Code

Source: Habr AI. Collage: Hamidun News.

After eighteen months of intensive work with Claude Code, dozens of experimental runs, and observation of engineering teams, one experienced developer reached an unexpected conclusion: the primary lever for improving LLM output quality is not the model generation or an upgrade to a newer version, but the harness built around the model.

What is a harness

In the engineering community, a harness is the complete circuit in which the language model operates. It is not simply a prompt passed into a chat, nor is it a single behavioral rule. It is the whole system through which the model interacts with the external world: its rules, boundaries, available tools, and the scope of visible context. Without this circuit, even the most powerful model operates inefficiently, like an excavator without an arm: it has an engine but nothing to dig or move earth with.

Why a harness delivers greater improvement than new models

Over a year of observation, it became clear that switching to a new model version delivers noticeable but limited quality gains. By the author's estimate, the transition from Claude 3 to Claude 4 is an improvement, but only by a few percentage points. Each new layer of harness around the same model, by contrast, produces a step change. Add a system prompt that clearly describes the model's role, and quality jumps. Connect tools through which the model can interact with reality, and it jumps again. Expand the context, add skills, configure permissions: in the author's experience the gains compound into a multi-fold improvement rather than a few percent. This shifts engineering focus away from chasing new versions and toward constructing the circuit in which the model operates.
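The layering described above can be pictured as the same task sent to the same model with progressively more around it. The following is a minimal sketch under stated assumptions: `Request`, `bare_request`, `harnessed_request`, the tool names, and the file paths are all hypothetical and do not correspond to Claude Code's actual API or to the author's setup.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical, provider-neutral description of one call to a model."""
    user_message: str
    system_prompt: str = ""
    tools: list[str] = field(default_factory=list)
    context_files: list[str] = field(default_factory=list)


def bare_request(task: str) -> Request:
    # No harness: the model sees only the raw task text.
    return Request(user_message=task)


def harnessed_request(task: str) -> Request:
    # Same model, same task; each layer gives the model more to work with.
    request = Request(user_message=task)
    # Layer 1: a system prompt that states the role and the boundaries.
    request.system_prompt = (
        "You are a senior backend engineer. Change only what the task requires."
    )
    # Layer 2: tools the model may call (names are illustrative only).
    request.tools = ["read_file", "run_tests"]
    # Layer 3: context the model can see (paths are illustrative only).
    request.context_files = ["README.md", "src/billing.py"]
    return request
```

The point of the sketch is only that the task text is identical in both cases; everything that differs lives in the harness, not in the model.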

Components of an effective harness

An effective harness consists of several key layers, each contributing its own value:

  • System prompt: explicit instructions that define behavior, communication style, and boundaries
  • Tools: model access to external APIs, databases, browsers, and computational resources
  • Context: deliberate management of what the model sees, remembers, and can rely on
  • Skills: prepackaged instructions and procedures for solving typical tasks
  • Hooks: handlers that run automatically when specific events or conditions occur
  • Permissions: the boundary between what the model can and cannot do, including which files it can read and write
  • Memory: long-term and short-term retention of project context, decisions, and insights

Each of these layers can be optimized independently, delivering performance improvements for specific task types.
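To make the runtime side of these layers concrete, here is a small sketch of how permissions, hooks, and memory might wrap a tool call. It is an illustration under assumptions, not the author's implementation and not Claude Code's API: the `Harness` class, its field names, and the glob-based write policy are all hypothetical.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch
from typing import Callable


@dataclass
class Harness:
    """Hypothetical container for the layers listed above."""
    system_prompt: str
    # Tools: name -> function taking and returning text.
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    # Permissions: glob patterns for paths the model may write to.
    writable_globs: list[str] = field(default_factory=list)
    # Hooks: callbacks run before every tool call; a hook may raise to block it.
    pre_tool_hooks: list[Callable[[str, str], None]] = field(default_factory=list)
    # Memory: a simple running log of what was done.
    memory: list[str] = field(default_factory=list)

    def can_write(self, path: str) -> bool:
        # A path is writable only if it matches at least one allowed pattern.
        return any(fnmatch(path, pattern) for pattern in self.writable_globs)

    def call_tool(self, name: str, argument: str) -> str:
        # Tools that were never registered are refused outright.
        if name not in self.tools:
            raise PermissionError(f"tool not allowed: {name}")
        for hook in self.pre_tool_hooks:
            hook(name, argument)
        result = self.tools[name](argument)
        self.memory.append(f"{name}({argument!r})")
        return result
```

Because each layer is a separate field, one can be tuned (for example, tightening `writable_globs`) without touching the others, which mirrors the independence of layers noted above.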

"Power exists, but there's nothing to use it with" is how the engineer describes a model without a harness.

What this means

Developer focus shifts from the race for newer and more powerful models toward building a well-designed circuit around existing ones. According to the author, investments in system prompt engineering, context management, skills development, and permission configuration pay off faster and deliver greater results than waiting for the next model version. They also allow customization to specific tasks and teams without costly hardware upgrades or new licenses.

Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.