June 4, 2026 (7-minute read)

Who Is Project Glasswing Actually For?

What Anthropic's Project Glasswing reveals about where the real constraint in security now sits — and why finding flaws was never the hard part.

In April 2026, Anthropic announced Project Glasswing, a program built around an unreleased AI model called Claude Mythos Preview. The company says the model is so good at finding weaknesses in software that it has chosen not to release it to the public. Instead, it has given access to about fifty organizations, including Amazon, Google, Microsoft, Cisco, and JPMorganChase, so they can use it to harden the software the world depends on. The headline claim is striking. In its first weeks, the model has helped find more than ten thousand serious security flaws in widely used software.

The capability looks real, and it deserves to be taken seriously. But the way a claim like this is framed shapes what the public and policymakers come to believe about it, and the framing here is doing a lot of quiet work. So it is worth digging deeper: what is genuinely new, who actually benefits, and who is accountable? 

A few terms first, in plain language. 

  • Vulnerability: a flaw in software that an attacker can abuse.
  • Zero-day: a vulnerability that the software's makers do not yet know about.
  • Patch: the fix for a vulnerability.
  • Open-source software: code built and maintained in the open, often by unpaid volunteers, that almost all other software quietly relies on.
  • Coordinated disclosure: the long-standing practice of telling the maker privately and giving them time to patch before the flaw is made public.

What is Glasswing?

Anthropic reports that most partners have each found hundreds of high-severity flaws, and that several have seen their rate of bug-finding rise more than tenfold. 

Cloudflare found 2,000 bugs across its core systems. Mozilla found and fixed 271 flaws in a single version of Firefox, more than ten times what its previous tools caught. The UK's AI Security Institute says Claude Mythos Preview is the first model to solve both of its simulated cyberattack ranges from end to end. In one case the model built a working method to forge security certificates in wolfSSL, a cryptography library used by billions of devices. Such a forgery would let an attacker convincingly impersonate a bank's website. That flaw has since been patched.

A mix of partners and independent groups report similar results. The UK's AI Security Institute and the security platform XBOW are genuinely independent, and new academic benchmarks such as ExploitBench, which grades how far an AI agent can climb from finding a flaw to fully exploiting it, also place Mythos at the top. Anthropic helped fund some of these benchmarks, however, so the scoreboards are not fully independent.

Even so, the capability itself looks real. The question is what follows from it.

Finding was never the hard part — patching is

The announcement reads as though this is the beginning of a new era. History says otherwise. Machines have been finding and even fixing software flaws on their own for a decade. In 2016, DARPA's Cyber Grand Challenge had automated systems hunt, patch, and attack software with no humans in the loop. Google's OSS-Fuzz and Project Zero have industrialized bug discovery for years. DARPA ran a two-year AI Cyber Challenge on exactly this premise, with finals in 2025.

That last contest is the useful one to dwell on, because it measured the part that matters. The best teams in the world found 86% of the planted flaws, but patched only 68% of the ones they found. In a clean competition, with elite teams and tidy targets, automated repair topped out around two-thirds. The real world is messier than a competition. The consistent lesson across all of this work is the same: fixing lags finding, and it lags badly.
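To make the compounding explicit, here is a minimal sketch of the arithmetic. It assumes the two percentages are read exactly as quoted above (86% of planted flaws found, 68% of those found then patched); the variable names are ours, not DARPA's.

```python
# Figures as quoted in the paragraph above.
found_rate = 0.86            # share of planted flaws that were found
patched_given_found = 0.68   # share of the found flaws that were patched

# Share of all planted flaws that ended up both found and patched.
end_to_end = found_rate * patched_given_found

print(f"found and patched: {end_to_end:.0%} of planted flaws")
```

On that reading, fewer than six in ten planted flaws were both found and fixed, even under competition conditions.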

It is worth being precise about what Glasswing actually changes, because the popular version overstates it. What dropped is the human cost of discovery, not the machine cost. The model needs far less of the rare expert time this work used to demand, and it runs at a speed no person can match. 

But the compute itself is not cheap. 

Anthropic reports that scanning a single target, the operating system OpenBSD, costs close to twenty thousand dollars, and individual long runs can cost far more. The much-repeated figure of fifty dollars to find a 27-year-old flaw was, in Anthropic's own words, a number that only makes sense in hindsight, since no one can know in advance which run will succeed. So discovery has not become free. It has become something you can buy, if you can afford the bill. 

Ten thousand found is not ten thousand fixed

The number doing the most work in the coverage is "ten thousand." It is a discovery number. Safety is a repair number, and that one is much smaller. By Anthropic's own account, its scan of open-source projects produced an estimated 6,202 high or critical flaws. Of those, around 530 have been disclosed to maintainers, and 75 have been patched.
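The funnel is easier to see as percentages. The sketch below uses only the three figures quoted above; note that 530 is an approximate count ("around 530"), so the shares are rough.

```python
# Figures as quoted in the paragraph above; 530 is approximate.
estimated = 6202   # estimated high or critical flaws
disclosed = 530    # disclosed to maintainers
patched = 75       # patched so far

disclosed_share = disclosed / estimated        # share of estimated flaws disclosed
patched_of_disclosed = patched / disclosed     # share of disclosed flaws patched
patched_share = patched / estimated            # share of estimated flaws patched

print(f"disclosed:            {disclosed_share:.1%} of estimated")
print(f"patched of disclosed: {patched_of_disclosed:.1%}")
print(f"patched overall:      {patched_share:.1%} of estimated")
```

Roughly one estimated flaw in eighty has reached the "fixed" end of the pipeline so far.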

Some of that gap is innocent. Anthropic follows the standard 90-day disclosure window, so many fixes simply have not come due yet, and some quiet patches go uncounted. The company says so, and a fair reading grants it. But Anthropic also concedes the rest plainly: this is, in its words, a genuine problem. Each high-severity bug takes about two weeks to patch on average, and several maintainers have asked the company to slow down because they cannot keep pace.

This is where the framing matters most. A flaw that has been found but not yet fixed is not obviously safer than one nobody knew about. Knowledge can leak. The window between discovery and repair is precisely when a system is most exposed, and Glasswing is manufacturing that window thousands of times over. The clearest sign that Anthropic understands this is its own design choice: for flaws it has not yet disclosed, it publishes only a cryptographic fingerprint of the details and withholds the specifics until a patch exists. That is an admission that the findings themselves are dangerous goods. Counting them as security gains is, at best, premature.
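For readers unfamiliar with the fingerprint idea, here is a minimal sketch of how a publish-now, reveal-later commitment can work. Anthropic has not, as far as this article knows, specified its exact scheme; the use of SHA-256, the random salt, and the function names `commit` and `verify` are all our assumptions for illustration.

```python
import hashlib
import secrets

def commit(report: bytes) -> tuple[str, bytes]:
    """Return a public fingerprint and a private salt for a report.

    The salt (an assumption of this sketch) keeps short or guessable
    reports from being recovered by brute-force hashing.
    """
    salt = secrets.token_bytes(16)
    digest = hashlib.sha256(salt + report).hexdigest()
    return digest, salt

def verify(digest: str, salt: bytes, report: bytes) -> bool:
    """Check that a later-revealed report matches the earlier fingerprint."""
    return hashlib.sha256(salt + report).hexdigest() == digest

# Hypothetical report text, for illustration only.
report = b"example finding: details withheld until a patch exists"
fingerprint, salt = commit(report)           # publish the fingerprint now
matches = verify(fingerprint, salt, report)  # reveal report and salt after the patch
```

The fingerprint proves the finding existed on the day it was published without revealing anything an attacker could use, which is exactly why it is chosen for material considered dangerous.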

Who actually benefits from Glasswing?

Look at where repair is working and where it is stalling. Among large companies fixing their own code, the speed-up is real. Palo Alto Networks shipped five times its usual number of patches. Oracle reports fixing flaws several times faster. In open-source software, where repair depends on volunteers and coordinated disclosure, the same flood produced only 75 fixes.

Same discovery engine, opposite outcomes. The variable is not the AI. It is institutional capacity, budget, and staff.

That is why the benefit concentrates. The roughly fifty organizations with early access are among the best-resourced on earth, and Anthropic itself sits at the center. The company is candid about this, describing Glasswing as an asymmetric advantage for systemically important defenders, while admitting there is an urgent need for everyone else to catch up. The trouble is the second half. Anthropic also says that models this capable will soon be built by many companies, which means the same discovery power will reach attackers and weak defenders before long.

Here the experts disagree in a way the public should hear. The security writer Bruce Schneier argues that, in the long run, AI favors defenders, because a fixed flaw is gone for good while an attack is fleeting. Other researchers, including those at Georgetown's CSET and the authors of the recent academic work "Uplifted Attackers, Human Defenders," argue that in the near term the benefit reaches attackers and under-resourced defenders unevenly. A small hospital, a school district, or a volunteer maintainer cannot absorb a flood of findings the way Google can. In that sense, Glasswing may widen the security gap before it ever narrows it. The people most exposed are the ones the program does not reach.

The cost lands on those least able to bear it

The open-source layer is not a detail. It is the foundation, and it has been fragile for the same reason for over a decade. When the Heartbleed flaw hit in 2014, the world learned that OpenSSL, which secured a large share of the internet, was maintained on about two thousand dollars a year by two volunteers. 

In 2024, the xz-utils incident showed the other face of the same problem: an attacker spent years gaining the trust of a single exhausted maintainer in order to plant a backdoor in a core component, and it was caught only by luck. The binding constraint in open source was never a shortage of bug-finders. It was underfunded human attention and misplaced trust.

Glasswing pours machine-speed findings onto exactly that layer. Anthropic has committed up to $100 million in credits for its own model and $4 million in donations to the organizations that have to do the fixing. That is a ratio of twenty-five to one between finding and fixing, and the balance points in the wrong direction for where the work actually piles up. Maintainers asking the company to slow down is the clearest signal that the help, as delivered, can function as a burden. Assistance that increases the unpaid workload of the people with the least support is not straightforward assistance.
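The ratio is simple to check. The sketch below treats the "up to $100 million" in credits as if the full ceiling were spent, which is an assumption; the real ratio could be lower if fewer credits are used.

```python
# Commitments as quoted above; credits are a ceiling ("up to"), not a spent amount.
credits_usd = 100_000_000    # model credits, which fund finding
donations_usd = 4_000_000    # donations to organizations doing the fixing

ratio = credits_usd / donations_usd
donation_share = donations_usd / (credits_usd + donations_usd)

print(f"finding to fixing: {ratio:.0f} to 1")
print(f"share of total going to fixing: {donation_share:.1%}")
```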

Who decides

The deepest problem is not technical. It is about authority. Anthropic built a tool with real offensive power, decided on its own that the public could not be trusted with it, and then chose, by criteria it has not disclosed, which organizations may use it. A private company is now making allocation decisions with national-security weight, with no public mandate and no external oversight. Researchers at Georgia Tech's Internet Governance Project describe Glasswing as a "transitional institution," a private club that makes sense only while the capability is rare, and that becomes hard to justify as it spreads.

Several conflicts sit underneath this. The $100 million is in Anthropic's own product credits, so the program also markets the very capability it warns about. The argument that "others will build this soon, so we must lead" is convenient and impossible to disprove, and it is being made by one of the companies driving that proliferation. And the evidence is hard to check independently. Anthropic has not published a technical account of how the model was built, its benchmark figures are self-reported and come with memorization caveats the company itself flags, and the delayed disclosure that protects users also shields the claims from outside scrutiny. Floating an "independent third-party body" as a future home for this work is an honest admission that the right governance does not exist yet. It is not a substitute for having it.

What the episode actually reveals

Strip away the framing and Glasswing's real contribution is not proof that AI can find flaws. It is the demonstration of where the constraint now sits. Once finding stops being the hard part, both value and danger move downstream, into the human work of confirming a finding is real, judging how much it matters, fixing the underlying cause rather than the symptom, and coordinating the fix without tipping off attackers. That work resists automation, and we have built very few institutions, and almost no funding, around it. The honest conclusion is not to point a stronger model at the problem. It is to build the verification and coordination capacity that turns findings into safety, and to pay the people who carry it.

We have reached the same conclusion in a far smaller domain. SecurityPal AI does not hunt for flaws in operating systems. We answer security and compliance questions for companies. But the principle our work runs on is the one Glasswing keeps proving at a much larger scale: a machine's output is a claim, not a fact, and the scarce, decisive resource is the expert judgment that confirms it before anyone relies on it. We call that discipline Hyper-Supervised Assurance Intelligence. Glasswing is a reminder of why it matters, written this time across the whole of the world's software.


Research and analysis by SecurityPal's Data Science team: Pratyush Acharya and Sweta Shrestha, with Habish Dhakal.

Habish Dhakal
Data & Research Scientist
