Anthropic Withholds AI Model Claude Mythos Over Hacking Fears

Anthropic, the San Francisco-based artificial intelligence startup, has announced that it will not release its latest AI model, Claude Mythos, to the public. This decision stems from fears that the tool could be exploited for widespread hacking, as it has demonstrated a remarkable ability to uncover software vulnerabilities.

Uncovering Hidden Vulnerabilities

During a presentation at the HumanX AI conference in San Francisco, Mike Krieger of Anthropic Labs revealed that Claude Mythos has identified thousands of weaknesses in commonly used applications. Many of these vulnerabilities lack existing patches or fixes, with the oldest dating back 27 years. According to Anthropic, none of these flaws were previously noticed by their creators before being pinpointed by the AI model.

Anthropic emphasized that AI models have reached a level where they can surpass all but the most skilled humans in finding and exploiting software vulnerabilities. The company warned in a blog post that the fallout from such capabilities could be severe for economies, public safety, and national security.

—

Wide Pickt banner — collaborative shopping lists app for Telegram, phone mockup with grocery list

Defensive Collaboration Initiative

Instead of a public release, Anthropic is forming an alliance with cybersecurity specialists and engineers in the open-source community. This initiative, dubbed "Glasswing," aims to use Claude Mythos as a defensive weapon to bolster protections against hacking. Krieger explained that this approach is about "sort of arming them ahead of time" to preempt potential threats.

Key participants in the Glasswing project include cybersecurity firms CrowdStrike and Palo Alto Networks, tech giants Amazon, Apple, and Microsoft, as well as networking companies Cisco and Broadcom. The Linux Foundation is also involved, promoting collaboration on the open-source Linux operating system.

Urgent Need for Protection

Anthony Grieco, Cisco’s chief security and trust officer, highlighted the urgency of this effort in a joint release. He stated, "AI capabilities have crossed a threshold that fundamentally changes the urgency required to protect critical infrastructure from cyber threats, and there is no going back." Approximately 40 organizations involved in computer system design, maintenance, or operation have joined the project.

Anthropic is providing about $100 million worth of computing resources for Glasswing. Partners will share findings from Mythos, with early work indicating that AI models can help find and fix software and hardware vulnerabilities at an unprecedented pace and scale.

Risks and Legal Challenges

The decision to withhold Claude Mythos follows a recent leak of some of its code, which prompted Anthropic to issue a warning about unprecedented cybersecurity risks. Elia Zaitsev, chief technology officer at Crowdstrike, noted, "The window between a vulnerability being discovered and being exploited by an adversary has collapsed – what once took months now happens in minutes with AI."

Despite a White House directive in February to terminate all contracts with Anthropic, the company has engaged in discussions with the US government regarding Mythos. A federal court judge has put this directive on hold while a legal challenge by Anthropic proceeds through the courts.

This move underscores the growing concerns about the dual-use nature of advanced AI tools, balancing innovation with security in an increasingly digital world.