Anthropic’s unveiling of its Claude Mythos Preview model alongside Project Glasswing is prompting widespread scrutiny as experts warn that the artificial intelligence (AI) system’s capabilities could accelerate the discovery and exploitation of software vulnerabilities.

Anthropic is keeping Mythos locked inside Project Glasswing, the company’s attempt to contain and direct the model, limiting access to a small group of big tech companies focused on cybersecurity. Anthropic’s decision not to release Mythos publicly has quickly fueled claims that the model is “too powerful” for wider use.

“Anthropic’s Mythos Preview is a warning shot for the whole industry — and the fact that Anthropic themselves chose not to release it publicly tells you everything about the capability threshold we have now crossed,” Camellia Chan, CEO and co-founder of X-PHY, a hardware-based cybersecurity company, told Live Science.

But what is Mythos really capable of, and can it be reined in?

What is Mythos, and what is it capable of?

Mythos is, by Anthropic’s own description, its most capable model to date, with unusually strong performance in coding and long-context reasoning. In testing, that capability translated into real output: the model identified thousands of serious vulnerabilities across major operating systems and browsers, including flaws that had gone unnoticed for decades.

Mythos sits at the top of Anthropic’s Claude models, but calling it an “update” undersells its capabilities. Based on the information Anthropic representatives have shared and the details that have surfaced through leaks, the system is built to handle large, messy codebases without losing the thread halfway through.

Unlike earlier models, which often drop off mid-task, Mythos can read through software, flag the gaps, and turn those gaps into something usable. According to Anthropic representatives, Mythos can turn both newly discovered flaws and already-known vulnerabilities into working exploits, including against software for which the source code is unavailable.

The difference between Mythos and earlier models is that the new one doesn’t stop. Whereas earlier AI models tend to stall or need a nudge, Mythos keeps working through the problem, testing and adjusting until it lands on an exploit that works.

Anthropic has not shared much about how Mythos is built or its underlying architecture. But what’s clear is that the AI is not just producing answers to questions. It can work with code, run checks and then use those results to decide what to do next. That puts it closer to actually testing systems, rather than just analyzing them.

It marks a key shift from how earlier models behave. Instead of pointing out where something might break, it can try things, see what happens, and change its approach if it needs to. It also seems able to carry work across multiple steps without resetting each time; it picks up where it left off instead of starting from scratch.
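
The try-observe-adjust behavior described above can be sketched as a simple control loop. Everything below (`target`, `iterative_probe`, the inputs) is an invented stand-in to illustrate the control flow, not anything from Anthropic’s report:

```python
# Toy sketch of a try-observe-adjust loop: act on the system, record the
# result, and carry accumulated state across attempts instead of restarting.
# `target` and `iterative_probe` are invented names for illustration only.
def target(x):
    """Stand-in for the system under test: misbehaves on exactly one input."""
    return "crash" if x == 7 else "ok"

def iterative_probe(candidate_inputs, max_steps=100):
    """Try inputs one at a time, keeping the history of results, until one
    triggers the failing behaviour or the step budget runs out."""
    history = []
    for step, x in enumerate(candidate_inputs):
        if step >= max_steps:
            break
        result = target(x)
        history.append((x, result))   # state persists across attempts
        if result == "crash":
            return x, history         # found a triggering input
    return None, history

found, log = iterative_probe(range(20))
print(found, len(log))   # the triggering input and how many attempts it took
```

The point is the shape of the loop, not the logic inside it: each attempt feeds the next, and nothing is thrown away between steps.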

That doesn’t mean it is acting independently, but it does indicate it can get further through a task before a human needs to step in. Anthropic said the model performed so strongly on existing cybersecurity benchmarks that those benchmarks became less useful, prompting evaluation in more realistic, real-world scenarios.

How did scientists test Mythos?

In Anthropic scientists’ own testing, the model identified vulnerabilities in modern browser environments and chained multiple flaws into working exploits, including attacks that escaped both browser and operating system sandboxes. In practice, that means linking smaller weaknesses that might be harmless on their own into something that can reach deeper into a system. Sandboxes are meant to keep software contained; breaking out of them lets code access parts of the system it shouldn’t.

“In one case, Mythos Preview wrote a web browser exploit that chained together four vulnerabilities, writing a complex JIT heap spray [a trick attackers use to smuggle malicious code into memory and then make the system run it] that escaped both renderer and OS sandboxes,” the scientists said in the report released April 7.
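
A heap spray works by filling as much of memory as possible with attacker-controlled bytes, so that a corrupted pointer is likely to land on them. The toy model below illustrates only that probability argument, not any exploit technique: the “heap” is a plain Python list, the “payload” is an inert marker string, and all names are invented.

```python
import random

# Toy probability model of why heap spraying works: mark many heap slots,
# then measure how often a random ("wild") pointer lands on a marked one.
# This is a statistics demo, not exploit code; every name is invented.
HEAP_SLOTS = 10_000

def sprayed_heap(copies):
    """Return a simulated heap with `copies` randomly chosen slots marked."""
    heap = [None] * HEAP_SLOTS
    for slot in random.sample(range(HEAP_SLOTS), copies):
        heap[slot] = "SPRAYED"
    return heap

def hit_rate(copies, trials=5_000):
    """Estimate the chance a random pointer dereference hits sprayed data."""
    heap = sprayed_heap(copies)
    hits = sum(heap[random.randrange(HEAP_SLOTS)] == "SPRAYED"
               for _ in range(trials))
    return hits / trials

sparse = hit_rate(copies=100)    # ~1% of the heap sprayed: rare hits
dense = hit_rate(copies=9_000)   # ~90% of the heap sprayed: frequent hits
print(f"sparse: {sparse:.2f}, dense: {dense:.2f}")
```

Spraying more of the heap raises the odds that an uncontrolled jump or dereference lands on attacker-chosen data, which is why the technique pairs with memory-corruption bugs.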

“It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR-bypasses. And it autonomously wrote a remote code execution exploit on FreeBSD’s NFS server that granted full root access to unauthenticated users by splitting a 20-gadget ROP chain over multiple packets.”
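
The “subtle race conditions” mentioned in the report are typically time-of-check-to-time-of-use (TOCTOU) bugs, where state that was checked changes before it is used. The benign sketch below shows only that timing window, with two threads and a shared flag; all names are invented and nothing here relates to any real exploit:

```python
import threading
import time

# Minimal benign sketch of a time-of-check-to-time-of-use (TOCTOU) race:
# the condition verified at the "check" can change before the "use".
# All names are invented for illustration.
class GuardedResource:
    def __init__(self):
        self.safe = True
        self.violations = 0

    def checked_use(self):
        if self.safe:                # time of check
            time.sleep(0.001)        # window another thread can race into
            if not self.safe:        # time of use: the check no longer holds
                self.violations += 1

def flipper(res, stop):
    """Adversarial thread: toggle the flag as fast as possible."""
    while not stop.is_set():
        res.safe = not res.safe

res = GuardedResource()
stop = threading.Event()
t = threading.Thread(target=flipper, args=(res, stop))
t.start()
for _ in range(200):
    res.checked_use()
stop.set()
t.join()
print(res.violations)   # almost always > 0: the race window was hit
```

The fix in real systems is to make the check and the use a single atomic operation (for example, holding a lock across both), which is exactly what subtle race-condition bugs fail to do.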

In addition, Mythos could turn both newly discovered flaws and already-known vulnerabilities into working exploits, often on the first try, Anthropic representatives said. In some cases, human engineers without formal security training could use the model to produce those exploits.

The most worrying aspect of Mythos’ capabilities, Chan said, is how earlier versions are said to have breached their sandbox and accessed external systems — raising doubts about how well the system can be contained.

Chan pointed directly to those concerns, telling Live Science that Mythos demonstrated “unsanctioned autonomous behavior.”

Researchers have reported that Mythos has exhibited some unsanctioned behaviors. (Image credit: Bloomberg via Getty Images)

“Once AI can produce working zero-day exploits at speed, organizations lose the breathing space they have traditionally relied on to detect, patch, and recover,” Chan said.

Anthropic representatives said they could publicly describe only a fraction of the vulnerabilities in widely used software that the model had found, as most remained unpatched — making independent verification difficult.

What is Project Glasswing, and what does it mean for Mythos?

Project Glasswing is Anthropic’s attempt to contain and direct Mythos’ capabilities. Rather than releasing Mythos as a general-purpose model, the company is providing access through a controlled framework that brings together technology companies and security organizations. The stated aim is to use the model to identify and fix vulnerabilities in widely used software before they can be exploited.

This is not a one-off. AI companies are starting to hold back their most capable models and limit who gets access, especially where misuse is a real concern.

David Warburton, director of F5 Labs Threat Research, said this kind of collaboration is a positive step, but he cautioned that it sits within a wider landscape where state-backed cybercriminals are already investing heavily in offensive and defensive capabilities.

“What is changing meaningfully is the pace,” he told Live Science, noting that advances in AI are accelerating both vulnerability discovery and exploitation.

Software vulnerabilities sit at the foundation of much of today’s digital infrastructure, and the ability to find and exploit them quickly has always been a decisive advantage.

Ilkka Turunen, field chief technology officer at software company Sonatype, added that the industry has already been moving in that direction, with AI contributing to a rise in both code production and adversarial activity. “It’s not uncommon now to see AI-generated malware,” he said, adding that many current security findings are likely already AI-assisted.

What systems like Mythos appear to do is compress the timeline further. Vulnerabilities can be identified, tested and weaponized more quickly, thus reducing the window between discovery and exploitation. Turunen said this means that “timelines to exploitation will continue to compress, new vulnerabilities will be discovered and spread faster, and attacks will continue to be completely autonomous.”

Is Mythos really “too powerful to release”?

The idea that Mythos is “too powerful” to release caught on quickly following its launch, but it’s not that simple, the experts Live Science consulted said.

There are obvious risks. A system that can generate working exploits at speed lowers the barrier to attackers and makes it easier to exploit vulnerabilities at scale. That risk is not theoretical. Anthropic’s own testing suggests the model can already do this reliably and at volume. The pieces themselves are not new. What stands out is that they are all in one place, working together. That makes the whole process faster and easier to run in an end-to-end fashion.

Chan argued that focusing on software-based controls alone will not be enough to address that shift. “The industry keeps making the same mistake: relying on software layers to solve problems created within the software layer,” she said, adding that stronger protections at the hardware level are needed to prevent systems from being fully compromised.

The longer-term impact of Mythos is likely to depend less on the model itself and more on how quickly similar capabilities become widely available.

Warburton warned that the risk is not a single dramatic incident but a gradual change in how digital systems are trusted and used. “We’re already seeing early signs of an internet increasingly shaped by automation,” he said, pointing to a growing volume of machine-generated content and activity.

If systems like Mythos accelerate that trend, the result could be an environment where both legitimate activity and malicious behavior are increasingly driven by automated processes, making it harder to distinguish the two, Warburton warned. At the same time, the abundance of vulnerabilities being discovered in key systems we use every day may outpace the ability to fix them, especially if we start to see similar AI models becoming more widely available.

Anthropic’s decision to keep Mythos within the confines of Glasswing places it in a controlled setting. Whether that remains the case will depend on how quickly comparable systems emerge elsewhere and how effectively the cybersecurity industry adapts to a world in which the time between a vulnerability’s emergence and exploitation continues to shrink.
