Anthropic Refutes Pentagon Allegations of Potential AI Tool Sabotage During Wartime
The Pentagon alleged that Anthropic could theoretically manipulate its AI models during wartime—for example, altering Claude's behavior mid-operation. The…
AI-processed from Wired; edited by Hamidun News
Anthropic denies that the company is capable of interfering with the functioning of its AI tools during the height of a military conflict. The statement was prompted by claims from the US Department of Defense, which suggested that Claude's developer could theoretically manipulate the behavior of models even after their deployment in combat conditions. The company's leadership insists this is technically impossible.
The dispute has become public amid growing interest from the American military establishment in commercial AI systems. Over the past two years, the Pentagon has been actively testing large language models for tasks such as logistics, intelligence data analysis, and operations planning. Anthropic is among the vendors with which the Department of Defense is negotiating or already collaborating on several programs.
The Pentagon's claim concerns a potential vector of vulnerability: if the developer retains the ability to remotely change or disable the model, it creates a risk of dependence on a private company in critically important situations. Military departments worldwide have traditionally sought sovereignty over their systems—especially in the context of active combat operations, when infrastructure reliability becomes a matter of life and death.
Anthropc counters: Claude's deployment architecture does not allow the company to interfere with already running instances without the customer's knowledge. The company's leadership emphasizes that a model deployed in an isolated environment of the military department functions independently from Anthropic's servers. The possibility of 'remote shutdown' or covert behavior modification, they say, is a technically unfeasible scenario with correct implementation.
Nevertheless, the question remains controversial. Cybersecurity experts have long pointed out that commercial AI models are updated through fine-tuning and RLHF mechanisms, which under certain conditions can affect system behavior imperceptibly to the end user. Military customers, unlike corporate ones, demand guarantees of model immutability for the entire contract period—a requirement that commercial vendors have not yet been able to meet with standard tools.
The situation reflects a broader contradiction: governments want to use advanced commercial AI systems, but are not willing to accept dependence on their creators. Open-source models—Llama, Mistral, and others—appear more attractive in this context: they can be deployed locally and have their version locked. However, in terms of quality, they currently lag behind proprietary counterparts in tasks requiring complex reasoning.
For Anthropic, the conflict with the Pentagon is a reputational burden at a time when the company is actively positioning itself as a responsible AI developer. Any doubts about Claude's reliability and predictability could slow negotiations with government customers—a segment that in the coming years could become one of the largest markets for large language models.
Want to stop reading about AI and start using it?
AI News is a curated feed of AI/tech news. Hamidun Academy teaches you to use AI systematically in your work.