The AI Too Dangerous to Release (They Are Releasing It)

Anthropic has developed an AI model called Claude Mythos that autonomously scans open-source code for vulnerabilities, finding over 6,200 critical flaws including a 27-year-old bug in OpenBSD, but the company admits it lacks safeguards to prevent the technology from being weaponized. Despite these risks, Anthropic plans to release future "Mythos-class models" publicly once it develops stronger protections, creating a contradiction that has already triggered government intervention and a 5-11% drop in cybersecurity stocks.

Finding software vulnerabilities used to require teams of security researchers months of painstaking analysis. Anthropic’s Claude Mythos does it automatically—and that’s exactly the problem. The company admits no one, including itself, has built safeguards strong enough to prevent such models from being weaponized. Yet Anthropic simultaneously promises to make “Mythos-class models” https://fortune.com/2026/04/07/anthropic-claude-mythos-model-project-glasswing-cybersecurity/ publicly available once it develops “far stronger safeguards.” This contradiction sits at the heart of AI’s cybersecurity revolution. When AI Outpaces Human Security Teams Mythos has already scanned more than 1,000 widely-used open-source projects, surfacing 6,202 high or critical-severity vulnerabilities. Among its discoveries: a 27-year-old bug in OpenBSD that survived decades of manual security review. The model doesn’t just find vulnerabilities—it can weaponize them, constructing working exploits that could enable convincing phishing sites or certificate forgery attacks. Current access remains tightly controlled through Project Glasswing , limiting the model to vetted organizations like: - AWS - Apple - Microsoft - Major cybersecurity vendors Even so, some open-source maintainers have asked Anthropic to slow down its disclosure rate because they lack resources to patch the flood of legitimate bugs Mythos keeps finding. The Safeguards That Don’t Exist Yet Here’s where things get complicated. Anthropic distinguishes between the current “Mythos Preview” which will never go public and future “ Mythos-class models https://www.securityweek.com/anthropic-mythos-detected-23000-potential-vulnerabilities-across-1000-oss-projects/ ” that supposedly will. The company offers no concrete timeline beyond “near future” and no technical specifics about what “far stronger safeguards” would actually look like. Meanwhile, unauthorized access has already occurred due to internal security lapses—raising questions about whether Anthropic can secure such powerful AI internally, let alone control its external distribution. The White House https://www.gadgetreview.com/white-house-app-caught-secretly-tracking-users-every-4-minutes has intervened to block proposed access expansion from 50 to 120 organizations over national security concerns, creating a system of informal AI licensing through government pressure rather than legal frameworks. The vulnerability discovery arms race has officially gone algorithmic. Cybersecurity stock prices dropped 5-11% when Mythos capabilities became public, while governments from Japan to India ordered emergency surveillance https://www.gadgetreview.com/us-operatives-built-a-surveillance-app-to-target-alberta-separatists reviews. Your security team may soon need AI-powered tools just to keep pace with AI-powered attackers—assuming you can access them first.