As companies continue to burn through billions of dollars running massively resource-hungry AI models, passing on only a fraction of the costs to consumers and enterprise clients, the AI race shows no signs of slowing down.
On Thursday, a data leak caused by a major security lapse in Anthropic’s public-facing content management system revealed that the company is working on a powerful new model release.
The company has since officially acknowledged the new project, dubbed “Claude Mythos,” with a spokesperson describing it to Fortune as a “step change” in AI proficiencies and the “most capable we’ve built to date.”
The spokesperson said it’s a “general purpose model with meaningful advances in reasoning, coding, and cybersecurity.”
In an enormously ironic twist, a draft blog obtained by Fortune, which was “available in an unsecured and publicly-searchable data store,” claimed that the new model “poses unprecedented cybersecurity risks.” In other words, let’s hope the new model wasn’t responsible for the security of Anthropic’s company blog.
It’s a major test for the company, which has drawn significant media attention lately for its Claude Code and Claude Cowork tools, whose success appears to have rattled Anthropic’s competitors, including OpenAI, to their core.
The leaks also revealed a “new tier” of AI models, dubbed Capybara. Mythos appears to be part of this new tier, but how Capybara fits in with Anthropic’s existing tiers — Opus, Sonnet, and Haiku, in decreasing size, capability, and cost — remains to be seen.
“Compared to our previous best model, Claude Opus 4.6, Capybara gets dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity, among others,” the leaked blog reads, as quoted by Fortune.
While it may score higher in cybersecurity tests, it could simultaneously represent a major challenge for existing cybersecurity defenses, the company warned.
“In preparing to release Claude Capybara, we want to act with extra caution and understand the risks it poses — even beyond what we learn in our own testing,” the company wrote in the leaked blog post. “In particular, we want to understand the model’s potential near-term risks in the realm of cybersecurity — and share the results to help cyber defenders prepare.”
The model “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders,” Anthropic boasted.
The risks evidently seemed real enough to investors: cybersecurity stocks plunged on Friday following the news.
Anthropic has also previously admitted that hackers used its Claude AI model to automate cybercrimes targeting banks and governments. According to the company’s November blog post, a Chinese state-sponsored group exploited the AI’s agentic capabilities to infiltrate “roughly thirty global targets and succeeded in a small number of cases” by “pretending to work for legitimate security-testing organizations” to sidestep Anthropic’s AI guardrails.
Reality check: a frontier AI company working on what it claims is the next big thing, more capable than anything that’s come before, is pretty standard fare, and it remains to be seen whether Claude Mythos will actually represent a major “step change” in practice, outside of a carefully curated testing environment.
Case in point, OpenAI’s long-awaited GPT-5 model turned out to be a major letdown when it was released in August, falling well short of the company’s lofty promises.
More on Anthropic: Protestors Outside Anthropic Warn of AI That Keeps Improving Itself