Anthropic, the prominent AI developer, has confirmed that it is investigating a report of unauthorised access to its Mythos model. The advanced artificial intelligence system has not been released to the public because of its ability to detect cybersecurity vulnerabilities, a capability that could be used to enable cyber-attacks.
Investigation into Third-Party Breach
The US-based startup issued a statement following a Bloomberg report on Wednesday, which detailed that a small group of individuals had accessed the Mythos model. Anthropic clarified that the incident involved a third-party vendor environment, stating, "We're investigating a report claiming unauthorised access to Claude Mythos Preview through one of our third-party vendor environments."
According to Bloomberg, a "handful" of users in a private online forum managed to gain access to Mythos on the same day Anthropic announced its limited release for testing purposes to select companies, including Apple and Goldman Sachs. The report indicated that these unnamed users exploited access privileges held by one individual who worked as a contractor for Anthropic, utilising methods commonly employed by cybersecurity researchers.
Nature of the Access and Potential Risks
Bloomberg corroborated the claims through screenshots and a live demonstration of the model, noting that the group has not executed cybersecurity prompts on Mythos. Instead, they appear more interested in "playing around" with the technology rather than initiating malicious activities. However, this potential breach is likely to alarm authorities who have previously expressed concerns about Mythos's ability to cause significant harm.
The incident raises critical questions about how such potentially damaging technology can be safeguarded from falling into the wrong hands. Kanishka Narayan, the UK's AI minister, has emphasised that UK businesses "should be worried" about the model's proficiency in identifying flaws in IT systems, which hackers could exploit.
Security Assessments and Capabilities
Mythos has undergone rigorous vetting by the UK's AI Security Institute (AISI), the world's leading safety authority for AI technology. Last week, AISI warned that Mythos represents a "step up" from previous models in the cyber-threat it poses. The institute highlighted that Mythos can autonomously carry out multi-step cyber-attacks and discover weaknesses in IT systems without human intervention, tasks that typically take human professionals several days to complete.
Notably, Mythos achieved a significant milestone by becoming the first AI model to successfully complete a 32-step simulation of a cyber-attack designed by AISI, solving the challenge in three out of ten attempts. This underscores the model's advanced capabilities and the heightened risks associated with its potential misuse.
As the investigation continues, the focus remains on ensuring robust security measures to prevent further unauthorised access and mitigate the cybersecurity threats posed by such powerful AI technologies.