Claude Mythos supposedly found a vulnerability, but it was already in the training data
Source: Habr AI. Collage: Hamidun News.

Anthropic made a stir in the press by announcing that its latest model, Claude Mythos, had discovered and exploited "the first kernel remote exploit identified and deployed by AI." The catchy claim quickly spread across news feeds. But when researchers dug into the details, they found a deflating explanation: the model had simply recalled a vulnerability from its training data, a roughly 20-year-old bug long known to specialists.

What Anthropic announced

Against the backdrop of growing interest in what advanced language models can do in cybersecurity, Anthropic claimed that during testing, Claude Mythos independently identified a kernel vulnerability, catalogued as a CVE (Common Vulnerabilities and Exposures) entry, that could in theory be exploited for remote code execution. According to the company, this was a striking example of cutting-edge AI finding real threats that humans might miss. The story sounded like a breakthrough in cybersecurity automation and sparked a wave of discussion about when AI would start finding vulnerabilities on its own.

What researchers found upon inspection

A group of researchers who analyzed the description of this event uncovered an inconvenient truth. The bug in question is a long-known vulnerability that was already in the public domain and was likely included in Mythos's training data. In other words, the model made no independent discovery; it simply recalled information it had seen during training. It's like praising a student for independently discovering the Pythagorean theorem when he merely reproduced the formula from a textbook.

The story raises several critical questions:

  • The boundary between memorization and discovery: how do you distinguish what the model memorized from what it genuinely discovered on its own? (A simple first check, sketched after this list, is whether the bug was public before the model's training cutoff.)
  • Lack of independent verification: how can we be sure that loud claims about AI achievements are not overstated?
  • Lack of transparency: why don't companies disclose full methodological details so that results can be independently verified?
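
On the first question, one cheap sanity check is chronological: if a vulnerability was publicly documented before a model's training cutoff, "rediscovering" it proves nothing about discovery capability. Below is a minimal sketch of such a check in Python. The NVD REST API endpoint is real; the training cutoff date and the example CVE ID are hypothetical placeholders, since neither Claude Mythos's cutoff nor the specific bug in this story has been disclosed.

```python
"""Minimal sketch: check whether a CVE was publicly disclosed before a
model's training cutoff, i.e. whether a "discovery" could plausibly be
memorization rather than genuine discovery."""
from datetime import datetime

import requests  # pip install requests

# Real NVD API 2.0 endpoint; unauthenticated requests are rate-limited.
NVD_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def cve_published_date(cve_id: str) -> datetime:
    """Fetch the official publication date of a CVE from the NVD."""
    resp = requests.get(NVD_API, params={"cveId": cve_id}, timeout=30)
    resp.raise_for_status()
    # Raises IndexError if the CVE ID is unknown; kept simple on purpose.
    published = resp.json()["vulnerabilities"][0]["cve"]["published"]
    # NVD timestamps look like "2004-12-31T05:00:00.000"
    return datetime.fromisoformat(published)


def could_be_memorized(cve_id: str, training_cutoff: datetime) -> bool:
    """True if the CVE was public before the cutoff and may be in training data."""
    return cve_published_date(cve_id) < training_cutoff


if __name__ == "__main__":
    # Hypothetical cutoff and an arbitrary old Linux kernel CVE for
    # illustration; not necessarily the bug from this story.
    cutoff = datetime(2024, 1, 1)
    cve = "CVE-2004-1137"
    if could_be_memorized(cve, cutoff):
        print(f"{cve} predates the cutoff: 'rediscovery' may be recall.")
    else:
        print(f"{cve} postdates the cutoff: pretraining memorization is ruled out.")
```

Of course, a CVE date is only a lower bound: even a bug absent from the NVD before the cutoff may have circulated in mailing lists, blogs, or exploit databases that also end up in training corpora. That is exactly why fuller methodological disclosure matters.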

Significance for industry

The story illustrates a fundamental problem: when a model has seen data during training, its re-"discovery" of that data is recollection, not genuine discovery. As AI becomes a tool in critical areas like security, the industry must develop more rigorous standards for evaluating results. Loud claims without transparent methodology and independent verification only erode trust in the industry and create unrealistic expectations about what current models can do.
