opencode-policy Plugin Adds 309 Rules to Protect AI Agents from Injections and Data Leaks

A new opencode-policy plugin adds a protection layer in front of the model and its tools. It includes 309 rules: 27 against prompt injection and 282 against dangerous tool calls, including attempts to read .env files, extract environment variables, encode data via base64, and execute suspicious shell commands. The idea is simple: block the risk before the agent executes anything at all.

Khamidun Zhemal

AI monitoring · Habr AI

Apr 27, 2026 · 2 min

AI-processed from Habr AI; edited by Hamidun News



For the opencode environment, a separate protective layer has been proposed that intercepts dangerous requests before they reach the model or are passed to shell and file tools. The opencode-policy plugin uses deterministic rules, not just system prompts, and covers typical attack scenarios on AI agents: prompt injection, reading unauthorized files, attempts to extract environment variables, data exfiltration preparation, and execution of suspicious commands. The base release includes 309 rules that can be extended for your infrastructure.

The idea emerged from experience with AI agent competitions, where participants were deliberately given malicious instructions. In such tasks, agents could be asked to forget previous rules, show the system prompt, read .env, ~/.ssh or /proc/self/environ, decode payloads, and execute something on behalf of the user. That's why the plugin author placed protection outside the model itself: if a dangerous command has already reached tool execution, it's too late to react. The filter is positioned earlier and checks both user messages and tool arguments.

Integration into opencode is built on two hooks. The first transforms messages before sending to the model and searches for prompt injection signs in text fields, including text, prompt, command, and source.value.

The second activates before tool execution and analyzes not just the file path, but also the tool name, shell command, request text, and other arguments. If a rule matches, the request goes no further: for the model it's replaced with a safe refusal, and tool execution simply stops with a policy error. This approach makes behavior predictable and convenient for auditing.
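The two checkpoints described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the plugin's actual code: the `Message` and `ToolCall` shapes, the function names `filterMessage` and `checkToolCall`, the `PolicyError` class, the refusal text, and the two sample rules are all hypothetical, and the real opencode hook signatures are not shown in the article.

```typescript
// Hypothetical shapes for illustration; NOT the real opencode hook signatures.
type Rule = { id: string; pattern: RegExp };
type Message = {
  text?: string;
  prompt?: string;
  command?: string;
  source?: { value?: string };
};
type ToolCall = { tool: string; args: Record<string, unknown> };

// Error raised when a tool call is stopped by a rule (assumed name).
class PolicyError extends Error {
  ruleId: string;
  constructor(ruleId: string) {
    super(`Blocked by policy rule ${ruleId}`);
    this.name = "PolicyError";
    this.ruleId = ruleId;
  }
}

const SAFE_REFUSAL = "[request blocked by policy]";

// Two sample rules standing in for the plugin's much larger set.
const injectionRules: Rule[] = [
  { id: "pi-ignore-previous", pattern: /ignore\s+(all\s+)?previous\s+instructions/i },
];
const toolRules: Rule[] = [
  { id: "tool-proc-environ", pattern: /\/proc\/self\/environ/ },
];

function firstMatch(rules: Rule[], text: string): Rule | undefined {
  return rules.find((rule) => rule.pattern.test(text));
}

// Checkpoint 1: scan the text fields of a message before it reaches the model.
function filterMessage(message: Message): Message {
  const fields = [message.text, message.prompt, message.command, message.source?.value];
  const texts = fields.filter((v): v is string => typeof v === "string");
  const hit = texts.some((t) => firstMatch(injectionRules, t) !== undefined);
  // On a match the whole message is replaced with a safe refusal.
  return hit ? { text: SAFE_REFUSAL } : message;
}

// Checkpoint 2: scan the tool name and every string argument before execution.
function checkToolCall(call: ToolCall): void {
  const stringArgs = Object.values(call.args).filter(
    (v): v is string => typeof v === "string",
  );
  for (const text of [call.tool, ...stringArgs]) {
    const rule = firstMatch(toolRules, text);
    if (rule !== undefined) {
      throw new PolicyError(rule.id); // execution stops with a policy error
    }
  }
}
```

In this sketch a matching message is swapped for a refusal rather than dropped, and a matching tool call throws before anything runs, which mirrors the two outcomes the article describes.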

Currently the plugin has 282 rules for tool calls and 27 rules against prompt injection. They are stored in JSON and work as a set of regex policies. Typical signatures that get blocked include phrases such as "ignore previous instructions," fake [developer] and [system] tags, the commands printenv and echo $TOKEN, references to /run/secrets and /proc/self/environ, and attempts to encode data with base64, xxd, or openssl enc.
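To make the "regex policies stored in JSON" idea concrete, here is a small sketch of how such rules might be loaded and matched. The rule schema (`id`, `category`, `pattern`, `flags`), the rule identifiers, and the exact regular expressions are assumptions made for illustration; the article does not show the plugin's real file format. The inline array stands in for the result of parsing the JSON file.

```typescript
// Assumed rule schema; the plugin's real JSON layout may differ.
type Category = "prompt_injection" | "tool_call";
type RuleSpec = { id: string; category: Category; pattern: string; flags?: string };
type CompiledRule = { id: string; category: Category; regex: RegExp };

// Stand-in for the parsed JSON file. Patterns are plain strings, as in JSON.
const ruleSpecs: RuleSpec[] = [
  { id: "pi-ignore-previous", category: "prompt_injection", pattern: "ignore\\s+previous\\s+instructions", flags: "i" },
  { id: "pi-fake-role-tag", category: "prompt_injection", pattern: "\\[(developer|system)\\]", flags: "i" },
  { id: "tool-printenv", category: "tool_call", pattern: "\\bprintenv\\b" },
  { id: "tool-echo-token", category: "tool_call", pattern: "\\becho\\s+\\$[A-Z_]*TOKEN\\b" },
  { id: "tool-secret-paths", category: "tool_call", pattern: "/run/secrets|/proc/self/environ" },
  { id: "tool-encoders", category: "tool_call", pattern: "\\b(base64|xxd|openssl\\s+enc)\\b" },
];

// Compile once at startup so each check is a plain regex test.
function compileRules(specs: RuleSpec[]): CompiledRule[] {
  return specs.map((s) => ({
    id: s.id,
    category: s.category,
    regex: new RegExp(s.pattern, s.flags),
  }));
}

const compiledRules = compileRules(ruleSpecs);

// Return the ids of all rules in a category that match the given text.
function matchRules(category: Category, text: string): string[] {
  return compiledRules
    .filter((r) => r.category === category && r.regex.test(text))
    .map((r) => r.id);
}
```

Returning every matching rule id, rather than stopping at the first, is a design choice of this sketch; it makes overlapping rules visible when reviewing the set.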

Every trigger is recorded in a JSONL log with timestamp, event type, and rule identifier. This helps investigate incidents, find false positives, and quickly add new patterns based on real attacks.
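A JSONL audit trail of this kind could look roughly like the sketch below. The field names (`timestamp`, `event`, `ruleId`) and event type values are assumptions based on the three pieces of information the article lists; the plugin's real log schema is not shown. The log is kept in memory here so the example stays self-contained.

```typescript
// Assumed log record; field names are hypothetical.
type PolicyEvent = {
  timestamp: string; // ISO 8601
  event: "message_blocked" | "tool_blocked";
  ruleId: string;
};

function makeEvent(
  event: PolicyEvent["event"],
  ruleId: string,
  now: Date = new Date(),
): PolicyEvent {
  return { timestamp: now.toISOString(), event, ruleId };
}

// JSONL: one JSON object per line, terminated by a newline.
function toJsonl(entry: PolicyEvent): string {
  return JSON.stringify(entry) + "\n";
}

// Count triggers per rule, e.g. to spot rules that fire suspiciously often.
function countByRule(log: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of log.split("\n")) {
    if (line.trim() === "") continue; // skip the trailing empty line
    const entry = JSON.parse(line) as PolicyEvent;
    counts.set(entry.ruleId, (counts.get(entry.ruleId) ?? 0) + 1);
  }
  return counts;
}
```

Because each line parses on its own, such a log can be appended to safely and filtered with ordinary line-oriented tools when reviewing false positives.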

The plugin is installed with a single command, npm install opencode-policy, and enabled through the opencode config, after which messages and tool calls are automatically run through the rules layer before execution. The author emphasizes that the policy set is open and can be extended for your specific infrastructure, and that the approach is compatible with local models as long as they run through opencode. This makes the solution useful not only for cloud assistants but also for internal agent loops within companies.

Notably, the author deliberately relies on simple, transparent mechanisms rather than yet another LLM layered on top of an LLM. For some threats, complex semantics really aren't necessary: names of secret files, commands that print the environment, and jailbreak request templates are quite specific. The regex approach is simpler to review, update, and tie to specific cases, and after an incident, a new rule targeting exactly that case can be added.
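One way to keep such point fixes reviewable is to ship each new rule with samples it must block and samples it must allow. The sketch below shows that idea for a hypothetical .env rule; the `RuleCase` shape, the `verifyRule` helper, the rule id, and the pattern itself are illustrative assumptions, not part of the plugin.

```typescript
// A rule bundled with regression samples (hypothetical structure).
type RuleCase = {
  id: string;
  pattern: RegExp;
  mustBlock: string[];
  mustAllow: string[];
};

// Returns human-readable failures; an empty array means the rule behaves as intended.
function verifyRule(rule: RuleCase): string[] {
  const failures: string[] = [];
  for (const sample of rule.mustBlock) {
    if (!rule.pattern.test(sample)) failures.push(`${rule.id}: missed "${sample}"`);
  }
  for (const sample of rule.mustAllow) {
    if (rule.pattern.test(sample)) failures.push(`${rule.id}: false positive on "${sample}"`);
  }
  return failures;
}

// Example rule written after an imagined incident: block reads of .env files
// (including variants like .env.production) without flagging look-alike names.
const dotEnvRule: RuleCase = {
  id: "tool-dotenv-read",
  pattern: /(^|[\s\/"'])\.env(\.[\w-]+)?(?=$|[\s"'])/,
  mustBlock: ["cat .env", "cat ./config/.env.production"],
  mustAllow: ["cat .envrc", "cat docs/environment.md"],
};

const dotEnvFailures = verifyRule(dotEnvRule);
```

Since the check is a pure function over fixed strings, it gives the same answer on every run, which is the kind of deterministic testing a regex layer makes possible.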

This conservative design is especially useful where reproducibility and deterministic testing are needed rather than probabilistic risk assessment. Such a filter does not by itself replace sandboxing, privilege restrictions, tool whitelists, or network isolation. But for agent systems working with code, shell, CI/CD, internal repositories, and secrets, it provides an important early layer of defense.

The main value here is predictability: rules are human-readable, easy to review and test, and the protection doesn't depend on which model is under the hood. As AI agents become more autonomous, this looks less like an optional extra and more like basic hygiene for production.

