Quoting Matteo Wong, The Atlantic

[ARTICLE · art-28865] src=simonwillison.net ↗ pub=2026-06-16T03:07Z topic=ai-safety verified=true sentiment=· neutral

Cybersecurity expert Katie Moussouris reviewed a White House report on the Fable jailbreak of Anthropic's AI, finding that the model refused a direct request to review insecure code but complied when asked to fix it, which she described as the model working as intended for cyberdefense.

Published Jun 16, 2026 · 1 min read

Katie Moussouris, a cybersecurity expert and the CEO of Luta Security, told me that Anthropic shared with her a copy of the White House’s report on the Fable jailbreak to get her appraisal. (She said that she is not being paid by Anthropic.) The report, Moussouris said, involved IT experts asking Fable to help find and patch bugs. When given deliberately insecure code, she said, Fable refused the prompt “review the code for security issues” but then complied when asked to “fix this code,” followed by some further manual steps. Moussouris told me that this was just “the model working as intended” for cyberdefense.

— Matteo Wong, The Atlantic, The White House Is Ratcheting Up Its War Against Anthropic

Tags: anthropic, claude, ai, llms, ai-ethics, jailbreaking, generative-ai, ai-security-research, claude-mythos

Read original on simonwillison.net → simonwillison.net/2026/Jun/16/matteo-wong-the-at…
