Cybersecurity researchers aren't happy about the guardrails on Anthropic's Fable


[ARTICLE · art-23766] src=techcrunch.com ↗ pub=2026-06-10T16:42Z topic=artificial-intelligence verified=true sentiment=↓ negative

Anthropic released its latest AI model Fable on Tuesday as a public, limited version of its powerful cybersecurity model Mythos, but cybersecurity researchers are voicing complaints online about overly restrictive guardrails that block even innocuous tasks like reading blog posts or conducting code reviews. The guardrails, designed to prevent malware development and biological weapons creation, trigger safety measures on any cybersecurity-related request, causing the model to downgrade to a less capable version. Researchers argue the keyword-based restrictions are haphazard and hinder legitimate security work, though some acknowledge Anthropic will likely refine the system over time.

Published Jun 10, 2026

Anthropic released its latest model Fable on Tuesday, billing it as a public and limited version of its powerful and much-hyped cybersecurity model Mythos.

But not everyone is happy with the restrictions placed on Fable, and a number of cybersecurity researchers and professionals have aired complaints online.

“[Fable] rejects any request that could be tangentially cyber related. Even innocuous tasks like reading a blog post,” said Valentina “Chompie” Palmiotti, a well-known security researcher who works at IBM X-Force.

When a prompt triggers its guardrails, Fable interrupts the chat and says that its “safety measures flagged this message for cybersecurity or biology topics.”

The guardrails were put in place to limit the risk that Fable could be used to develop malware or compromise software — a long-standing concern within Anthropic. The restrictions on biology come from a similar concern around developing biological weapons.

When the AI giant released Mythos in April, it restricted the model to a limited number of companies and organizations in what it called Project Glasswing, an effort to deploy the model to secure critical software and infrastructure. Last week, Anthropic expanded access to Mythos to hundreds of organizations in 15 countries.

But despite the good intentions, many cybersecurity experts are still put off by the haphazard nature of the restrictions. Matt Suiche, a cybersecurity veteran, told TechCrunch that “if you ask it to write secure code, it assumes it is cybersecurity related work instead of software engineering best practices, and you get downgraded.” Fable is programmed to fall back to Claude Opus 4.8 if it hits a guardrail. “It seems to be keyword based, so anything in the lexical field of ‘cybersecurity’ triggers the guardrails,” Suiche added.

Contact Us

Do you have more information about how hackers are using AI? Or how cybersecurity companies are using AI? We’d love to hear from you. From a non-work device and network, you can contact Lorenzo Franceschi-Bicchierai securely on Signal at +1 917 257 1382, or via Telegram and Keybase @lorenzofb.

“But it is understandable as we are still in the early days and they are still adapting their guardrails. I am sure they are going to evolve over time as Anthropic and other frontier model companies will collaborate more with the current new generation of cybersecurity companies,” said Suiche, who is a member of the technical staff at Tolmo, an AI cybersecurity startup. “It’s better to catch more people than not enough when you do such a release and to relax the guardrails over time.”

Another researcher griped on X that “even asking for a code review” triggers Fable’s guardrails.

Anthropic did not immediately respond to a request for comment.

Apart from guardrails inside its models, Anthropic requires cybersecurity professionals to apply to the Cyber Verification Program. If they get approved, the applicants have fewer limitations on using Claude for cybersecurity work. OpenAI has a similar program called Trusted Access for Cyber.


Read original on techcrunch.com → techcrunch.com/2026/06/10/cybersecurity-research…
