As companies continue to burn through billions of dollars running massively resource-hungry AI models, passing on only a fraction of the costs to consumers and enterprise clients, the AI race shows no signs of slowing down.
On Thursday, a data leak caused by a major security lapse in Anthropic’s public-facing content management system revealed that the company is working on a powerful new model release.
The company has since officially acknowledged the new project, dubbed “Claude Mythos,” with a spokesperson describing it to Fortune as a “step change” in AI proficiencies and the “most capable we’ve built to date.”
The spokesperson said it’s a “general purpose model with meaningful advances in reasoning, coding, and cybersecurity.”
In an enormously ironic twist, a draft blog obtained by Fortune, which was “available in an unsecured and publicly-searchable data store,” claimed that the new model “poses unprecedented cybersecurity risks.” In other words, let’s hope the new model wasn’t responsible for the security of Anthropic’s company blog.
It’s a major test for the company, which has drawn significant media attention lately for its Claude Code and Claude Cowork tools, whose success appears to have rattled Anthropic’s competitors, including OpenAI, to their core.
The leaks also revealed a “new tier” of AI models, dubbed Capybara. Mythos appears to be part of this new tier, but how Capybara fits in with Anthropic’s existing tiers — Opus, Sonnet, and Haiku, in decreasing size, capability, and cost — remains to be seen.
“Compared to our previous best model, Claude Opus 4.6, Capybara gets dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity, among others,” the leaked blog reads, as quoted by Fortune.
While it may score higher in cybersecurity tests, it could simultaneously represent a major challenge for existing cybersecurity defenses, the company warned.
“In preparing to release Claude Capybara, we want to act with extra caution and understand the risks it poses — even beyond what we learn in our own testing,” the company wrote in the leaked blog post. “In particular, we want to understand the model’s potential near-term risks in the realm of cybersecurity — and share the results to help cyber defenders prepare.”
The model “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders,” Anthropic boasted.
The risks evidently seemed real enough to investors: cybersecurity stocks plunged on Friday following the news.
Anthropic has also previously admitted that hackers used its Claude AI model to automate cybercrimes targeting banks and governments. According to the company’s November blog post, a Chinese state-sponsored group exploited the AI’s agentic capabilities to infiltrate “roughly thirty global targets and succeeded in a small number of cases” by “pretending to work for legitimate security-testing organizations” to sidestep Anthropic’s AI guardrails.
Reality check: a frontier AI company working on what it claims is the next big thing, more capable than anything that’s come before, is pretty standard fare, and it remains to be seen whether Claude Mythos will actually represent a major “step change” in practice, outside of a carefully curated testing environment.
Case in point, OpenAI’s long-awaited GPT-5 model turned out to be a major letdown when it was released in August, falling well short of the company’s lofty promises.
More on Anthropic: Protestors Outside Anthropic Warn of AI That Keeps Improving Itself