Anthropic: Claude Mythos Preview Finds Thousands of Critical Code Vulnerabilities
Anthropic unveiled Claude Mythos Preview as a new class of tool for discovering code bugs. According to the company, the model has already identified…
AI-processed from IEEE Spectrum AI; edited by Hamidun News
Anthropic has demonstrated how rapidly cybersecurity is changing in the era of generative AI: according to the company, its Claude Mythos Preview model identified thousands of high- and critical-severity vulnerabilities, including issues in popular operating systems, browsers, and cryptographic libraries. But the key takeaway is not simply that AI has learned to find bugs better. The same capabilities that help identify weaknesses in code can also be used to exploit them, so security automation now requires not only speed but also new rules of control.
In early April, Anthropic's Frontier Red Team reported that Mythos Preview had discovered numerous serious issues, even though the model was not specifically trained to search for such vulnerabilities. According to the company, the findings include defects in virtually all major operating systems and major browsers. Examples cited include a 27-year-old bug in OpenBSD that allows a machine to be compromised remotely, a browser vulnerability through which an attacker can read data from another domain, and weaknesses in cryptographic libraries that could lead to decryption of protected traffic or certificate forgery.
Against the backdrop of these results, Anthropic launched Project Glasswing. The project involves Amazon Web Services, Apple, Google, Microsoft, and Nvidia, and the partnership's task is straightforward: use Mythos Preview to scan software and strengthen its protection.
The logic is clear. If large language models are already capable of analyzing massive codebases, tracking data flow between components, and noticing non-trivial connections between errors, then they become more than just another static analysis tool: they become an instrument that in some ways begins to approach the work of a human security researcher.
Industry practitioners make the same point. They note that the strength of such models is not just speed, although that matters in itself, but the ability to reason about code semantics.
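As a rough illustration of what "tracking data flow between components" means in practice, the toy sketch below contrasts a fixed single-line pattern rule with a simple taint-propagation pass. It is ordinary taint tracking over a made-up four-line program, not a description of how Mythos Preview or any LLM works internally; the function names (read_request, strip, build_sql, db_execute) and the rule itself are hypothetical.

```python
import re

# Hypothetical toy program: untrusted input passes through two helpers
# before it reaches a sensitive call.
SOURCE_LINES = [
    "raw = read_request()",
    "trimmed = strip(raw)",
    "query = build_sql(trimmed)",
    "db_execute(query)",
]

# Rigid rule: only fires when source and sink appear in the same expression.
RULE = re.compile(r"db_execute\(\s*read_request\(")

def pattern_findings(lines):
    """Return 1-based line numbers that match the fixed pattern."""
    return [i for i, line in enumerate(lines, 1) if RULE.search(line)]

# Matches flat statements of the form `target = func(a, b)` or `func(a, b)`.
CALL = re.compile(r"^(?:(\w+)\s*=\s*)?(\w+)\((.*)\)$")

def taint_findings(lines, sources=("read_request",), sinks=("db_execute",)):
    """Follow untrusted values across assignments and report tainted sink calls.

    Toy limitation: arguments must be plain variable names, so nested
    calls such as f(g()) are not analyzed.
    """
    tainted = set()
    findings = []
    for i, line in enumerate(lines, 1):
        m = CALL.match(line.strip())
        if not m:
            continue
        target, func, raw_args = m.groups()
        args = [a.strip() for a in raw_args.split(",") if a.strip()]
        carries_taint = func in sources or any(a in tainted for a in args)
        if func in sinks and carries_taint:
            findings.append(i)
        if target and carries_taint:
            tainted.add(target)
    return findings

print("pattern rule:", pattern_findings(SOURCE_LINES))
print("taint tracking:", taint_findings(SOURCE_LINES))
```

On this input the pattern rule reports nothing, because the source and the sink never share a line, while the taint pass flags line 4 after following the value through raw, trimmed, and query.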
Traditional rule-based tools primarily match code against predefined patterns, while modern LLM systems can trace how data flows through different levels of abstraction and detect a problem that arises only at the junction of several components. For large repositories, this is particularly important: an AI agent can more easily find a rare but dangerous needle in a massive haystack of code.
However, this approach has a downside. Models still produce false positives, and may misclassify an ordinary bug as a vulnerability or overstate the severity of a problem. For open source project maintainers, this becomes an additional burden: the number of reports grows, and the time to review each signal increases.
Another risk is that AI tools can not only be attacked, for example through prompt injection, but also be used as an offensive mechanism.
The same Mythos Preview, it is claimed, can link multiple separate defects into a step-by-step exploitation chain that ultimately grants root-level access to the Linux kernel.
This is why experts speak not of completely replacing humans, but of a multi-layered verification scheme. One approach already used in the industry is adversarial self-review: the model first finds a problem, then attempts to challenge its own conclusion or passes the finding to another model for independent validation. This additional layer helps reduce noise, but does not eliminate manual review. AI conclusions remain probabilistic, which means the final decision should be made by a specialist who understands the product's business logic, system architecture, and real exploitation scenarios.
Companies are additionally advised to develop dynamic threat modeling and red teaming, and to shift security to the beginning of the development process, so that developers eliminate weak points while writing code, rather than after release.
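The layered review described above (a finding, a challenge pass by the same or another model, then a human decision) can be sketched as a small triage step. This is a minimal illustration under stated assumptions: the two reviewer functions are hard-coded stand-ins for model calls, and the Finding fields, status names, and sample reports are all hypothetical. The one property it tries to preserve from the text is that nothing is confirmed automatically: a finding is either deprioritized with recorded objections or queued for a specialist.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Finding:
    location: str
    claim: str
    severity: str
    objections: List[str] = field(default_factory=list)
    status: str = "new"  # becomes "deprioritized" or "needs_human_review"


# A reviewer returns an objection, or None if it cannot refute the finding.
Reviewer = Callable[[Finding], Optional[str]]


def self_challenge(finding: Finding) -> Optional[str]:
    # Stand-in for asking the same model to argue against its own report.
    if "test fixture" in finding.claim:
        return "code path only exists in tests"
    return None


def independent_check(finding: Finding) -> Optional[str]:
    # Stand-in for a second model validating the report independently.
    if finding.severity == "critical" and "reachable" not in finding.claim:
        return "no evidence the flaw is reachable from untrusted input"
    return None


def triage(findings: List[Finding], reviewers: List[Reviewer]) -> List[Finding]:
    """Run every reviewer on every finding; never auto-confirm anything."""
    for finding in findings:
        for review in reviewers:
            objection = review(finding)
            if objection is not None:
                finding.objections.append(objection)
        # Unchallenged findings go to a person; challenged ones are kept
        # with their objections rather than silently dropped.
        finding.status = (
            "deprioritized" if finding.objections else "needs_human_review"
        )
    return findings


reports = triage(
    [
        Finding("parser.c:120", "reachable heap overflow in length field", "critical"),
        Finding("util_test.c:8", "unchecked strcpy in test fixture", "high"),
        Finding("tls.c:77", "possible timing leak in handshake", "critical"),
    ],
    [self_challenge, independent_check],
)

for report in reports:
    print(report.status, report.location, report.objections)
```

Keeping challenged findings with their objections, instead of deleting them, is a deliberate choice in this sketch: the challenge pass is itself probabilistic, so a maintainer can still inspect what was filtered out.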
The main question now is not whether AI can find vulnerabilities better and faster than humans, but how to integrate it into a secure workflow. The next frontier for such systems is not just detection but also large-scale remediation of the issues found. If this stage can be automated without loss of quality and without giving up human control, secure software development will accelerate noticeably. If not, the industry will have another powerful but noisy tool that creates as much work as it saves.