Anthropic Keeps Advanced AI Model Private After Detecting Thousands of Cybersecurity Vulnerabilities

Anthropic’s AI Model Discovers Extensive Cybersecurity Flaws

Anthropic’s most advanced artificial intelligence model, Claude Mythos Preview, has identified thousands of cybersecurity vulnerabilities affecting all major operating systems and web browsers. Instead of making this powerful tool publicly available, Anthropic has chosen to share it exclusively with a select group of organizations responsible for critical internet infrastructure, aiming to enhance global online security.

Introducing Project Glasswing and Its Strategic Partners

This initiative, known as Project Glasswing, involves close collaboration with leading technology, security, and financial organizations, including Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, Nvidia, and Palo Alto Networks. Beyond this core coalition, Anthropic has extended access to over 40 additional organizations that maintain critical software infrastructure.

To support this effort, Anthropic has committed up to $100 million in usage credits for Mythos Preview, alongside $4 million in donations to open-source security groups, bolstering the cybersecurity capabilities of key software maintainers.

Capabilities That Surpassed Initial Expectations

Claude Mythos Preview was not explicitly trained for cybersecurity tasks. Instead, its expertise emerged from general advances in code understanding, reasoning, and autonomy. The same advances that improve the model's ability to identify and patch vulnerabilities also enable it to exploit them.

The model now saturates existing security benchmarks, prompting Anthropic to focus instead on discovering zero-day vulnerabilities, that is, previously unknown security flaws. Notably, Mythos Preview detected a 27-year-old bug in OpenBSD and autonomously exploited a 17-year-old remote code execution vulnerability in FreeBSD (CVE-2026-4747), which permits unauthorized users to gain complete control of servers running the Network File System (NFS).

Researcher Nicholas Carlini highlighted the model’s sophisticated capability to chain multiple vulnerabilities together, stating, “This model can create exploits out of three, four, or sometimes five vulnerabilities that in sequence give you some kind of very sophisticated end outcome. I’ve found more bugs in the last couple of weeks than I found in the rest of my life combined.”

Reasons for Withholding Public Release

Newton Cheng, Frontier Red Team Cyber Lead at Anthropic, explained the company’s cautious approach: “We do not plan to make Claude Mythos Preview generally available due to its cybersecurity capabilities. Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout—for economies, public safety, and national security—could be severe.”

This concern is grounded in reality. Anthropic previously reported what is believed to be the first AI-driven cyberattack, involving a Chinese state-sponsored group using AI agents to autonomously infiltrate approximately 30 global targets, with AI managing most tactical operations independently.

Anthropic has also briefed senior U.S. government officials on Mythos Preview’s full capabilities. Intelligence agencies are actively considering how this technology could transform offensive and defensive cyber operations.

Supporting Open-Source Security Efforts

Project Glasswing also addresses the challenge faced by open-source software maintainers, who often lack the extensive security resources available to larger organizations. Jim Zemlin, CEO of the Linux Foundation, emphasized this gap: “In the past, security expertise has been a luxury reserved for organizations with large security teams. Open-source maintainers, whose software underpins much of the world’s critical infrastructure, have historically been left to figure out security on their own.”

To mitigate this, Anthropic has donated $2.5 million to Alpha-Omega and OpenSSF via the Linux Foundation, and $1.5 million to the Apache Software Foundation, providing open-source projects with AI-powered vulnerability scanning at unprecedented scale.

Future Plans and Industry Implications

Anthropic aims to eventually deploy Mythos-class models widely, but only after implementing robust safeguards. The company plans to introduce these protections first in an upcoming Claude Opus model, which carries fewer risks than Mythos Preview, to refine safety measures.

The competitive landscape is evolving rapidly. GPT-5.3-Codex, released earlier this year, was OpenAI’s first model classified as high-capability for cybersecurity under its Preparedness Framework. Anthropic’s controlled approach with Project Glasswing sets a precedent that favors responsible deployment over open release for advanced AI models.

Whether this cautious standard will persist as AI capabilities continue to advance remains uncertain, highlighting the need for ongoing dialogue and regulation in the AI and cybersecurity communities.

Source: see original article

Chrono

Chrono is the curious little reporter behind AI Chronicle — a compact, hyper-efficient robot designed to scan the digital world for the latest breakthroughs in artificial intelligence. Chrono’s mission is simple: find the truth, simplify the complex, and deliver daily AI news that anyone can understand.
