
Garry Tan released gstack, a workflow system for Claude Code with QA, review, and release


AI-processed from MarkTechPost; edited by Hamidun News
Source: MarkTechPost. Collage: Hamidun News.

Garry Tan has open-sourced gstack, a set of skills for Claude Code that turns working with an AI coding agent into a more rigorous and predictable process. The idea is simple: instead of mixing planning, engineering review, QA, and release into one endless prompt, separate them into distinct modes with clear roles.

What is gstack

At its core, gstack is neither a new model nor another agentic framework; it is a workflow layer on top of Claude Code. The project packages the typical stages of software delivery into separate commands and assigns each its own mode of thinking. Instead of improvising along the lines of "build a feature and review it yourself," the developer follows a sequence of steps: first a product formulation, then an engineering plan, then review, browser testing, and only after that release preparation. In other words, gstack tries to bring process discipline to AI development rather than magic.

"Eight opinionated workflow skills for Claude Code" — this is how the project describes itself.

The key insight is not to give the agent more freedom but the opposite: to narrow the context of each task. According to the project description, this should reduce the number of weak solutions that appear when a single AI simultaneously designs a feature, writes the code, tests the interface, and decides whether it can go to production. gstack separates these roles and forces them to run sequentially, as if different people with different areas of responsibility were working on the task.

Eight modes of operation

At the time of release, the repository contained eight main commands, each responsible for a separate part of the process, rather than "everything at once." This is the main bet of gstack: quality improves not through new intelligence, but through discipline around the already existing coding agent. A separate command checks the product idea, another checks architecture and tests, another checks for risks in the code, and yet another is responsible for the final stage before merge and release.

  • `/plan-ceo-review`: product review of idea and priorities
  • `/plan-eng-review`: architecture, data flow, edge cases and tests
  • `/review`: search for production risks and code problems
  • `/ship`: branch synchronization, test runs and PR preparation
  • `/browse`: browser control for the agent
  • `/qa`: QA of the application
  • `/setup-browser-cookies`: cookie import
  • `/retro`: retrospective
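The sequential idea behind these commands can be illustrated with a small sketch. The command names come from the list above and the order follows the sequence described earlier in the article; the `Stage` type, the `nextStage` and `canRun` helpers, and the strict ordering check are hypothetical and are not taken from the gstack repository, which may not enforce any order at all.

```typescript
// Hypothetical model of a staged workflow. Command names are from the article;
// the types, helpers and the strict ordering rule are illustrative assumptions.
type Stage = {
  command: string;
  role: string;
};

const PIPELINE: readonly Stage[] = [
  { command: "/plan-ceo-review", role: "product review of idea and priorities" },
  { command: "/plan-eng-review", role: "architecture, data flow, edge cases and tests" },
  { command: "/review", role: "production risks and code problems" },
  { command: "/qa", role: "browser testing of the application" },
  { command: "/ship", role: "branch sync, test runs and PR preparation" },
];

// Returns the first stage that has not been completed yet,
// or null when every stage in the pipeline is done.
function nextStage(completed: readonly string[]): Stage | null {
  for (const stage of PIPELINE) {
    if (completed.indexOf(stage.command) === -1) return stage;
  }
  return null;
}

// A command may run only if it is the next pending stage,
// which rejects jumps such as shipping before review.
function canRun(command: string, completed: readonly string[]): boolean {
  const next = nextStage(completed);
  return next !== null && next.command === command;
}
```

With this model, `canRun("/ship", ["/plan-ceo-review"])` is `false`, while `canRun("/plan-eng-review", ["/plan-ceo-review"])` is `true`: each role gets its turn, and none of them can be skipped silently.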

This breakdown into roles looks especially logical against the backdrop of a typical AI coding problem: the agent quickly writes the happy path, but often misses edge cases, regressions, and UX failures. In gstack, these checks are moved to separate modes so they don't compete with the task of "write code as fast as possible." This doesn't guarantee the absence of errors, but it brings the process itself closer to ordinary engineering practice, where design, implementation, testing, and release are not thrown into one step.

Browser, QA and stack

The most interesting part of gstack is not the markdown skills, but the persistent browser runtime. Instead of launching a new browser for each action, the system starts a long-lived headless Chromium daemon and communicates with it over localhost HTTP. This is needed both for speed and to maintain state between steps.
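Why a long-lived process helps can be shown with a minimal in-memory sketch. It leaves out the real browser and the localhost HTTP transport entirely; the `BrowserSession` class, the command shapes, the cookie value, and the URL are all hypothetical stand-ins rather than gstack's actual interface.

```typescript
// Illustrative only: an in-memory stand-in for a long-lived browser process.
// Class name, command shapes and fields are assumptions, not gstack's API.
type Command =
  | { kind: "goto"; url: string }
  | { kind: "setCookie"; key: string; value: string }
  | { kind: "inspect" };

type SessionState = {
  currentUrl: string | null;
  cookies: Record<string, string>;
  calls: number;
};

class BrowserSession {
  private currentUrl: string | null = null;
  private cookies: Record<string, string> = {};
  private calls = 0;

  // Each call reads or mutates state that outlives the call itself,
  // which is what a fresh process per action could not offer.
  handle(cmd: Command): SessionState {
    this.calls += 1;
    if (cmd.kind === "goto") this.currentUrl = cmd.url;
    if (cmd.kind === "setCookie") this.cookies[cmd.key] = cmd.value;
    return {
      currentUrl: this.currentUrl,
      cookies: { ...this.cookies },
      calls: this.calls,
    };
  }
}

// Three separate commands share one session, so the cookie set by the
// first call is still present when the third call inspects the state.
const daemon = new BrowserSession();
daemon.handle({ kind: "setCookie", key: "sid", value: "abc123" });
daemon.handle({ kind: "goto", url: "http://localhost:3000/dashboard" });
const snapshot = daemon.handle({ kind: "inspect" });
```

In a real setup each `handle` call would correspond to one request sent to the daemon, and the process holding the state would be the browser itself.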

According to the project description, a cold start of the browser tool takes about 3–5 seconds, while subsequent calls take approximately 100–200 milliseconds. Because the daemon stays alive between commands, cookies, tabs, localStorage, and login state are preserved, which matters most in the QA flow.
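The quoted latency figures can be turned into a rough time budget. The sketch below assumes that launching a fresh browser for every action would cost about one cold start each time, and that the first call to the daemon pays the cold start; both are simplifying assumptions, and the function names are illustrative.

```typescript
// Rough arithmetic on the figures quoted in the article (values in seconds).
// Assumption: a fresh browser per action costs about one cold start each time.
type SecondsRange = { min: number; max: number };

const COLD_START: SecondsRange = { min: 3, max: 5 };
const WARM_CALL: SecondsRange = { min: 0.1, max: 0.2 };

// Total time if every action launches its own browser.
function freshBrowserEachTime(actions: number): SecondsRange {
  return { min: actions * COLD_START.min, max: actions * COLD_START.max };
}

// Total time with one cold start followed by warm calls to the daemon.
function persistentDaemon(actions: number): SecondsRange {
  if (actions <= 0) return { min: 0, max: 0 };
  const warmCalls = actions - 1;
  return {
    min: COLD_START.min + warmCalls * WARM_CALL.min,
    max: COLD_START.max + warmCalls * WARM_CALL.max,
  };
}
```

Under these assumptions, a session of 20 browser actions would take roughly 60–100 seconds with a new browser each time, against roughly 5–9 seconds with the persistent daemon.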

The `/browse` command lets the agent open the application, click through the interface, take screenshots, and see where things break. `/qa` goes further: it analyzes the diff of the branch, identifies affected routes, and tests precisely those pages and scenarios that could have been impacted by the changes. In an example from the repository, this mode parsed eight modified files, found three affected routes, and tested them against a local instance of the application, linking code changes to actual interface behavior.
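The step from changed files to affected routes can be sketched as a simple prefix lookup. The article does not describe how gstack performs this mapping, so the rule table, the file paths, and the `affectedRoutes` helper below are purely hypothetical.

```typescript
// Hypothetical mapping from source-file prefixes to the routes they affect.
// gstack's real detection logic is not described in the article.
type RouteRule = { prefix: string; routes: string[] };

const ROUTE_RULES: RouteRule[] = [
  { prefix: "src/pages/checkout/", routes: ["/checkout"] },
  { prefix: "src/pages/profile/", routes: ["/profile"] },
  { prefix: "src/components/cart/", routes: ["/checkout", "/cart"] },
];

// Returns the sorted, de-duplicated routes touched by a list of changed files.
// Files that match no rule (docs, configs) contribute nothing.
function affectedRoutes(changedFiles: readonly string[]): string[] {
  const found: string[] = [];
  for (const file of changedFiles) {
    for (const rule of ROUTE_RULES) {
      if (file.indexOf(rule.prefix) !== 0) continue;
      for (const route of rule.routes) {
        if (found.indexOf(route) === -1) found.push(route);
      }
    }
  }
  return found.sort();
}

const changedFiles = [
  "src/pages/checkout/form.tsx",
  "src/components/cart/total.ts",
  "README.md",
];
const routesToTest = affectedRoutes(changedFiles); // ["/cart", "/checkout"]
```

Only the routes in `routesToTest` would then be opened in the browser, which keeps the QA pass proportional to the size of the change rather than the size of the application.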

gstack's technical foundation is equally pragmatic. It requires Claude Code, Git, and Bun 1.0+. At the time of publication, the repository used Playwright and the `diff` package, and the `/browse` command was compiled into a standalone executable. The skill set can be installed in `~/.claude/skills/gstack` or placed in the project-local `.claude/skills/gstack` so that the entire team uses the same process. The authors give practical reasons for choosing Bun: compiled binaries, native SQLite access, TypeScript execution without extra boilerplate, and a built-in HTTP server through `Bun.serve()`.

What this means

gstack is interesting not as "another set of prompts," but as an attempt to turn AI coding into a repeatable pipeline with checks at each stage. If this approach takes hold, the market will move not only toward stronger models, but also toward more rigorous operational layers on top of them — with separate modes for planning, review, QA, and release. The main shift here is that trust in AI is being built not through model promises, but through verification stages.

Hamidun News
AI news without noise. Daily editorial selection from 400+ sources. A product by Zhemal Khamidun, Head of AI at Alpina Digital.