Habr AI: How Pipeline Triad Assembles an AI Agent Pipeline Instead of a Development Team
Instead of a single 'super-agent' for development, a pipeline of triads is proposed: creator, critic, and arbiter. The Pipeline Triad Pattern breaks down the…
AI-processed from Habr AI; edited by Hamidun News
The idea of one universal AI developer is gradually giving way to a more practical scheme: instead of a "super agent," a pipeline of specialized trios is proposed, where one agent creates a result, the second searches for errors, and the third makes a decision. This is exactly how the Pipeline Triad Pattern works — a development model designed not for demonstrations, but for typical enterprise tasks where requirements, standards, and rules are already described, and the human remains in control at several of the most expensive points in the process. The scheme is built on three roles: Creator, Critic, and Arbiter.
The first generates an artifact, the second checks its logic, quality, and risks, and the third decides whether to pass the result forward or send it back for rework. This approach relies on a simple idea: language models are poor at correcting their own errors without external verification, so rather than endlessly strengthening a single agent, it is more reliable to build a trio of independent agents with different functions. The pattern's author applies the familiar enterprise maker-checker-approver principle to agent development and extends it across the entire SDLC.
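The division of roles described above can be illustrated with a minimal Python sketch. It is not the author's implementation: the three roles are plain callables standing in for model calls, and the names `run_triad`, `Review`, and the `max_rounds` retry limit are assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Review:
    """Critic output: a verdict plus the remarks behind it (hypothetical shape)."""
    passed: bool
    remarks: List[str] = field(default_factory=list)


def run_triad(
    task: str,
    creator: Callable[[str, List[str]], str],
    critic: Callable[[str, str], Review],
    arbiter: Callable[[str, Review], str],
    max_rounds: int = 3,  # assumption: the article does not state a retry limit
) -> Tuple[str, int]:
    """Loop Creator -> Critic -> Arbiter until the Arbiter passes the artifact."""
    remarks: List[str] = []
    for round_no in range(1, max_rounds + 1):
        artifact = creator(task, remarks)   # Creator produces or reworks the artifact
        review = critic(task, artifact)     # Critic checks it independently
        if arbiter(artifact, review) == "pass":
            return artifact, round_no       # Arbiter lets the result move forward
        remarks = review.remarks            # otherwise send it back with the remarks
    raise RuntimeError(f"no accepted artifact after {max_rounds} rounds")


# Stub roles so the sketch runs without any model API.
def stub_creator(task: str, remarks: List[str]) -> str:
    artifact = f"code for: {task}"
    return artifact + " + tests" if remarks else artifact


def stub_critic(task: str, artifact: str) -> Review:
    if "tests" in artifact:
        return Review(passed=True)
    return Review(passed=False, remarks=["no tests"])


def stub_arbiter(artifact: str, review: Review) -> str:
    return "pass" if review.passed else "return"


artifact, rounds = run_triad(
    "freeze account endpoint", stub_creator, stub_critic, stub_arbiter
)
print(rounds, artifact)
```

With these stubs, the first draft is returned for rework because it has no tests, and the second round is accepted. In a real setup each callable would wrap a separately prompted agent, so that the Critic does not share the Creator's context.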
At the same time, Pipeline Triad is not proposed as a replacement for CI/CD. Automated pipelines continue to build, test, and deploy code, but another layer appears above them: an agent delegation layer, where decisions are made not by a rigid script but with context, regulations, and business rules taken into account. The complete scheme consists of 14 steps from task statement to production.
Seven of them are agent stages: analytics, development, code review, testing, regression, security, and release artifact preparation. Another four points remain with the human: requirements validation, readiness approval, deployment confirmation, and final pre-production verification. At each stage, the trio must deliver not just a text answer but a formalized package: the artifact itself, PASS or FAIL criteria, a log of the Critic's remarks, and the Arbiter's decision: pass, return, or pass partially.
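One way to picture the formalized package is as a small typed record, together with a gate that a following stage could apply before starting. This is a sketch under assumptions: the field names, the `Decision` enum, and `is_valid_input` are hypothetical, and the article does not say how a partial pass should be handled, so the gate below conservatively accepts only a full pass.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Decision(Enum):
    PASS = "pass"
    RETURN = "return"
    PARTIAL = "partial"


@dataclass
class StagePackage:
    """Hypothetical shape of what a trio hands to the next stage."""
    stage: str                 # e.g. "code review"
    artifact: str              # the produced artifact (or a reference to it)
    criteria: Dict[str, bool]  # criterion name -> True for PASS, False for FAIL
    critic_log: List[str]      # the Critic's remarks, kept for audit
    decision: Decision         # the Arbiter's decision


def is_valid_input(pkg: StagePackage) -> bool:
    """Assumed gate: non-empty artifact, every criterion PASS, Arbiter said pass."""
    return (
        bool(pkg.artifact.strip())
        and bool(pkg.criteria)
        and all(pkg.criteria.values())
        and pkg.decision is Decision.PASS
    )


accepted = StagePackage(
    stage="development",
    artifact="freeze_account.py",
    criteria={"compiles": True, "access checks present": True},
    critic_log=["round 1: missing access check", "round 2: ok"],
    decision=Decision.PASS,
)
rejected = StagePackage(
    stage="development",
    artifact="freeze_account.py",
    criteria={"compiles": True, "access checks present": False},
    critic_log=["missing access check"],
    decision=Decision.RETURN,
)
print(is_valid_input(accepted), is_valid_input(rejected))
```

Because every package keeps the Critic's log next to the Arbiter's decision, storing these records per stage is enough to reconstruct why a given artifact was passed or returned.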
As a result, the pipeline turns from a set of prompts into a reproducible process with traceable decisions. The next stage does not start until it receives valid input, which means you can build an audit trail, measure quality, and analyze errors retrospectively. The practical value of the pattern is best seen on a typical task.
As an example, the author takes a banking endpoint that freezes an account: first, a trio clarifies requirements and edge cases, then separate trios write code, check access rights, add tests for race conditions, and run regression and security checks, after which a person only confirms a few key decisions. In such a scenario, human involvement is estimated at approximately one hour in total, versus two to three weeks in a conventional enterprise process. In terms of cost, the author estimates a full run through the API at approximately 42–84 model calls, 1–2 million input tokens, and 200–400 thousand output tokens, which gives an order-of-magnitude estimate of $6–12 per task.
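The token figures can be turned into a rough cost check. The article does not state which per-token prices the author used; the sketch below assumes $3 per million input tokens and $15 per million output tokens, chosen only because they reproduce the quoted range. Actual prices depend on the provider and model.

```python
def run_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float = 3.0,    # assumed USD per 1M input tokens
    output_price_per_m: float = 15.0,  # assumed USD per 1M output tokens
) -> float:
    """Estimate the API cost of one pipeline run from its token counts."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Lower and upper bounds from the article's token estimates.
low = run_cost_usd(1_000_000, 200_000)
high = run_cost_usd(2_000_000, 400_000)
print(f"${low:.2f} to ${high:.2f} per task")  # prints "$6.00 to $12.00 per task"
```

Under these assumed prices, input and output tokens contribute equally to the total, so shortening either the context passed between stages or the generated artifacts would reduce the cost by a similar amount.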
For pilots and personal setups, a subscription might be cheaper, but for a stable production flow you will still need to account for rate limits, budget, and actual token consumption. At the same time, the approach has hard boundaries. It works well for change requests, bugfixes, CRUD and API tasks, integrations, and infrastructure changes, where the domain is formalized and the result can be verified by tests and artifacts.
Pipeline Triad works worst where there is a lot of uncertainty: in discovery, greenfield architecture without mature standards, R&D, and large cross-team refactorings. The risks are quite mundane as well: an agent can invent a non-existent business rule, a poorly configured Critic will either let errors slip through or reject everything indiscriminately, and parallel work on multiple tasks will quickly run into conflicts over context, migrations, and branches. A separate topic is the security of the pipeline itself.
If agents are given access to the repository, secrets, and deployment without strict restrictions, the new process becomes an additional attack surface. Therefore, the author insists on the principle of least privilege, separate role-based access, a complete audit log, a policy engine for tool use, and filtering of sensitive data before it enters the model's context. What this means in practice: the idea of a "team of agents" becomes not a fantasy about a universal AI employee, but a more grounded engineering model, where acceleration is achieved through specialization, formalized inputs, and control at expensive steps.
But the article is also candid about the limits of the approach: it does not eliminate the need for organizational alignment, does not solve the problem of poorly specified requirements, and does not remove responsibility from the experienced people who must validate the result. If such pipelines become a working standard, it will first be in predictable enterprise domains where the cost of error is high and the rules can already be turned into a verifiable process.