Anthropic on AI Agents in Cybersecurity: Capabilities and Pitfalls

Q: What is the source?

Originally published on Habr AI. Hamidun News processes and adapts the material with AI.

Q: When was it published?

Jun 15, 2026. Reading time: 3 min.

Hamidun News Editorial

Anthropic has published research on applying AI agents to cybersecurity tasks. Edgar Sipki, a Habr author and founder of easyp & sipki tech, analyzed the document and asked an uncomfortable question: how far can these agents actually be trusted in practice?

What Anthropic Says

The company tested Claude agents on a wide range of information security tasks: static code analysis, vulnerability detection, threat modeling, and infrastructure security assessment. In typical scenarios, the agents performed above the level of an average specialist: they processed large codebases faster and caught common vulnerability patterns that are easy to miss during manual review under deadline pressure.

Areas where agents already provide practical value:

Static code analysis — detection of SQL injections, XSS, unsafe dependencies, and hardcoded secrets
Automated threat model building for new services
Accelerated penetration testing: the agent maps the attack surface, the specialist focuses on complex vectors
Generation of detailed risk reports and recommendations for prioritizing fixes
Monitoring code base changes for security regressions
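
To make "common vulnerability patterns" concrete, here is a minimal sketch of a pattern-based check for two items from the list above: hardcoded secrets and SQL built by string concatenation. It is an illustration only and not the method described in the Anthropic research; the rule names, the regular expressions, and the SAMPLE snippet are all assumptions made for this example.

```python
import re

# Hypothetical, deliberately simple rules. Real scanners (and LLM-based agents)
# rely on much richer analysis than single-line regular expressions.
PATTERNS = {
    "hardcoded-secret": re.compile(
        r"""(?i)\w*(password|secret|api_key|token)\w*\s*=\s*["'][^"']+["']"""
    ),
    "sql-string-concat": re.compile(
        r"""(?i)\bexecute\(\s*["'].*\b(select|insert|update|delete)\b.*["']\s*(\+|%)"""
    ),
}


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a rule."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings


# Illustrative input: line 3 uses a parameterized query and should not be flagged.
SAMPLE = '''\
db_password = "hunter2"
cursor.execute("SELECT * FROM users WHERE id = " + user_id)
cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
'''

print(scan(SAMPLE))
```

A check like this catches only the textbook form of each problem. The same flaw written in a less standard way (a query assembled across several lines, a secret loaded from an oddly named variable) would slip through.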

On paper this sounds convincing. On closer inspection, though, the picture becomes more complicated.

Where Agents Fall Short

The main problem is the quality of work on edge cases. Agents hallucinate vulnerabilities that do not exist and, at the same time, miss real problems hidden in non-standard code or application-specific business logic. In cybersecurity this is especially critical. A false positive wastes the team's time investigating a non-existent threat. A false negative leaves a real hole open while creating a false sense of security. The second scenario is worse than having no check at all, because it lulls the team into complacency.
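
The two error types can be put into numbers with the standard precision and recall metrics. The sketch below is a worked example with made-up counts; the figures do not come from the Anthropic research or from the Habr analysis.

```python
def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float]:
    """Precision: share of reported findings that are real.
    Recall: share of real vulnerabilities that were reported."""
    reported = true_pos + false_pos
    real = true_pos + false_neg
    precision = true_pos / reported if reported else 0.0
    recall = true_pos / real if real else 0.0
    return precision, recall


# Hypothetical audit: the agent reports 40 findings, 30 of them real,
# and misses 10 real vulnerabilities.
precision, recall = precision_recall(true_pos=30, false_pos=10, false_neg=10)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these assumed counts, a quarter of the reported findings are wasted investigation effort (false positives), and a quarter of the real vulnerabilities stay open (false negatives) even though the scan "passed".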

Another weakness is limited system context. The agent sees only what it is given. Vulnerabilities that arise from the interaction of multiple components, deployment specifics, or a particular cloud environment often go unnoticed, because finding them requires understanding the entire architecture, not individual files.

Access Permissions — A Separate Problem

A serious question that rarely surfaces in marketing materials: what permissions does an agent need to work effectively in a security context? Full infrastructure scanning requires elevated privileges, and elevated privileges are themselves an attack vector: if the agent is compromised or makes a consequential error, the scale of the problem grows rapidly. Anthropic recommends the principle of least privilege and isolated environments for agent-based security tasks. But proper configuration takes additional engineering work, and under tight deadlines it is often skipped. That creates exactly the kind of hole the agent is supposed to close.
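
One way to picture least privilege for an agent is a deny-by-default allowlist, with the riskiest actions gated behind explicit human approval. The sketch below is a hypothetical illustration, not a configuration taken from Anthropic's recommendations; the AgentPolicy class, the authorize function, and the action names are all invented for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical policy object: any action not listed is denied."""
    allowed: frozenset = frozenset()
    needs_approval: frozenset = frozenset()


def authorize(policy: AgentPolicy, action: str, approved_by: Optional[str] = None) -> bool:
    """Allow low-risk actions outright; require a named approver for risky ones."""
    if action in policy.needs_approval:
        return bool(approved_by)
    return action in policy.allowed


# Assumed action names, chosen only for illustration.
POLICY = AgentPolicy(
    allowed=frozenset({"read_repo", "run_static_scan"}),
    needs_approval=frozenset({"scan_production_hosts"}),
)

print(authorize(POLICY, "run_static_scan"))
print(authorize(POLICY, "scan_production_hosts"))
print(authorize(POLICY, "scan_production_hosts", approved_by="alice"))
print(authorize(POLICY, "delete_firewall_rule"))
```

The point of the design is that forgetting to list an action results in a refusal, not in silent access, so a rushed setup fails closed.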

"Agents are not a replacement for security specialists, but a tool to accelerate their work" is a key thesis of the Anthropic research.

What This Means

AI agents already provide value in routine security tasks: standard code review, initial scanning, report preparation. But giving them autonomous authority over critical infrastructure is premature. The main conclusion, which Edgar Sipki also draws, is that agents change the security team's toolkit, not its composition. A human in the loop remains mandatory, especially where the cost of error is high.
