The Anthropic Fable Mess


Need a catchup? We have you covered.

Happy Sunday! Below is my summary of the Anthropic, Mythos, Fable, government saga that has been The Topic since Friday. Consider it an opinionated tick-tock of what matters. Enjoy! — Alex

The Timeline

February/March: Anthropic and the Department of Defense get into a spat over the use of the company’s AI models; Anthropic wants certain limits on how its technology is used. The ensuing fight leads to Anthropic being dubbed a supply-chain risk, theoretically restricting government use of the company’s models and, in the same stroke, limiting government contractors’ access to the same.

**April 7:** Anthropic introduces Mythos, a new model family. After discovering that Mythos is adept at finding and exploiting novel cybersecurity flaws, Anthropic launches ‘Project Glasswing’ to provide critical technology companies with access to tools to harden their own software before a broader release.

April 16: The White House was reportedly working to make a version of Mythos available for agency use.

**April 30:** Anthropic wants to expand the number of groups that can access Mythos; the White House reportedly opposes the move, concerned that adding more partners would make Anthropic compute-constrained, thereby limiting USG access to the model.

June 2: Anthropic announces that Project Glasswing’s initial 50 partners found more than 10,000 serious software flaws by using Mythos. The company expands Project Glasswing to 150 new organizations in 15 countries. Anthropic also promises that it is working to “safely release Mythos-level capabilities in general access,” but says that doing so depends on “highly robust safeguards that prevent the model’s cyber capabilities from being misused–safeguards that we (and, to our knowledge, all other AI developers) have yet to develop.”

**June 9:** Anthropic announces and releases Fable 5, a ‘Mythos-class’ model built to reduce cybersecurity and biology-related risks. The company said in its release notes that it had built safeguards for Mythos (Fable) sufficient “for a general release.” At the time, Anthropic said that it had “prioritized safety” by making the guardrails on Fable 5 “stricter than would be ideal.”

Some users found Fable 5 so restricted in certain use cases (mostly biology-related queries) that it was nigh useless.

Fable 5 also featured distillation protections and a strict 30-day data retention policy that Anthropic claimed would help “defend against complex and novel attacks (including new jailbreaks and attacks that operate across many requests) as well as help [it] identify and reduce false positives.”

At this juncture, the core complaint levied against Anthropic was that it was too cautious; that by keeping Mythos private and releasing only a security-minded version of the model (Fable), it was creating a two-tier AI market. People didn’t like that!

Anthropic had “previously notified the government multiple times” about its June 9th release date for Fable before the launch.

June 10: Anthropic releases two frameworks to address advanced AI development and its economic impacts. The papers called for “government action and for regulation—regulations that are carefully designed to prevent government overreach and protect innovation,” including, when “a model poses risks of this kind,” allowing the government to “have the legal authority to block or deter its deployment.”

June 11: Amazon’s Andy Jassy reportedly told the government that Amazon’s researchers had found a way to get Fable 5 to “provide [it] with information that could be used to aid cyberattacks and was supposed to be off-limits.” (At least five other companies chimed in, too, making this less an Amazon-specific point, even if Amazon was a key actor.)

June 12: Senior White House staff and administration leaders met to discuss the situation, then brought Anthropic CEO Dario Amodei into the conversation via phone. (This took 1.25 hours per Politico, with the company saying that it offered other senior execs in the interim; there’s beef about why it took so long to get Amodei on the horn. We can skip the drama and focus on what matters.)

June 12, continued: Amodei viewed the issue as a misunderstanding and argued that the reported “bypass” did not “pose the same risk as a broader ‘jailbreak.’” The White House “urged” Anthropic “to voluntarily remove the model and coordinate with the government to address the vulnerabilities.” Amodei asked for more “time and information, but he made no commitments to pull the model.”

Politico reports that Treasury Secretary Scott Bessent “told Amodei directly that he was making a ‘bad decision,’ according to the senior White House official.”

June 12, even more: After failing to reach a deal with Anthropic, the Trump administration imposed export controls on both Fable 5 and Mythos 5 (the two available versions of Mythos-class models, with differing safety limits).

Anthropic responds by saying that the “export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees” meant it must “abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance.” (Emphasis original.)

Good Guy Anthropic, the argument

Anthropic built a new, more powerful AI model that it believed had unique, novel cyber-related capabilities. It wanted to get ahead of the model’s cybersecurity risks, so it quickly assembled a group of leading technology companies and provided them with subsidized early access; governments red-teamed Fable’s safeguards before launch.

After fighting to secure more partners’ access to Mythos, Anthropic built new guardrails for the model and even worked with the administration to ensure it had built a sufficiently safe system to release Fable 5 — the constrained version of Mythos 5 — to the public. To wit:

In the weeks leading up to the launch of Fable, Anthropic worked with the US government, the UK AISI, multiple private third-party organizations and internal teams to red-team Fable’s safeguards for thousands of hours in total.
These tests showed that Fable’s safeguards are substantially more effective than those of any previously deployed model.
No testers have yet been able to find a universal jailbreak—a jailbreak method that can very broadly bypass the model’s safeguards, unblocking a wide range of cyber capabilities.

So, when the government tried to convince the company to voluntarily pull Fable 5 in light of the partial jailbreak that Amazon uncovered, Anthropic demurred in part because the issue was not a complete jailbreak. Even more, Anthropic said that it “reviewed a report that we believe is the basis of the government’s directive and validated that the level of capability displayed there is widely available from other models.”

Anthropic has argued that some government regulation of AI is a good idea. The way things went down was not what it had in mind:

As we have stated publicly, we believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. This action does not adhere to those principles.

The company is not standing alone in its views. As Axios reports, some cybersecurity leaders view the government’s actions as out of line.

**To sum:** Anthropic went slow, tried to be the Good Guy by giving companies early access to Mythos so they could find flaws and patch their code before similar capabilities became widespread, moved mountains to release a version of Mythos to the public, and was slapped by the government for its efforts.

The best argument in favor of Anthropic is that it bent over backward to release truly frontier-level AI to the world as quickly as it could, while also showing reasonable restraint. Anthropic turned down buckets of revenue by not releasing Mythos earlier. (By trading some of the model’s time atop the benchmark charts, the company willingly deprived itself of revenue to try and keep people safe.)

Bad Guy Anthropic, the argument

There are two main arguments for Anthropic being the Bad Guy in all of this:

Anthropic cried wolf about AI risk and thus deserves all that has come its way:

Anthropic’s posture that we need some AI regulation is unpopular in technology circles. Despite some leading tech figures having called for AI development s before, the company has become the poster child for a rules-based AI market in the United States.

Anthropic made a major fuss about Mythos’s abilities, which some in technology-land viewed as specious fear-mongering more than technical fact.

Do not cry wolf if you don’t want people to ask you to stay inside your house while they hunt the wolf!

Anthropic should have done what the government wanted, and thus deserves all that has come its way:

Former US AI leader David Sacks wrote that Mythos-level capabilities are too powerful to release to the public, and therefore the situation is Anthropic’s fault: “Fable is Mythos with guardrails. But if those guardrails fail, then you’ve exposed Mythos and its advanced cyber capabilities to people who shouldn’t have them. (Keep in mind that Anthropic itself widely promoted the idea that Mythos was a cyberweapon and needed to be regulated as such. They asked for government regulation of Mythos and championed the guardrails on Fable. If there is a vulnerability — big or small — it is Anthropic’s responsibility to patch.)”

So long as Anthropic simply did what the government wanted (pull Fable of its own accord, and work with the government to make it more secure), there’s no problem! (Some technologists disagree.)

Axios has a killer line in its reporting: “The source familiar with the government’s thinking said there was a ‘lack of seriousness’ that Anthropic was applying to the release of Fable.”

There’s no shortage of anti-Anthropic schadenfreude in technology circles today. For a host of reasons. But if I had to boil them down, many tech figures despise Anthropic because it’s a company run by do-gooder nerds who think that their approach to AI development balances economic, technological, and humanistic requirements. Others in tech circles simply want AI development to go faster.

Ironically, the same people who are most opposed to AI regulation are also the folks who victim-blame Anthropic for not bending over backward to let the government determine its model release schedule.

I doubt that they would be singing the same tune if xAI had built the market-advancing AI model and Democrats were in the White House. Something to keep in mind.

The strongest argument against Anthropic is that, after it spent so much time claiming that Mythos was uniquely dangerous, it was an error to presume the government would allow imperfect guardrails to suffice.

Where I land

I side with Anthropic’s view of the events over the government’s own (and the views of its boosters). From what I can ferret out, Anthropic did more than was required of it to prepare Mythos for safe release, effectively charting a new path for AI companies to build, test, share, and release models with novel capabilities. And when it comes to smaller jailbreaks, I do not believe that a full pulling of Fable was required to keep the public safe. (I do reserve the right to change my mind when presented with new information.)

Regarding Amazon’s role in surfacing the Fable concern, I don’t grant infinite influence. Several companies found potential issues. Even if Amazon had not, the result would have been the same: The government wanted to pull Fable, Anthropic didn’t think it was necessary, and the government swung the axe; the DoD wasn’t the eye of this particular storm (as far as we can tell at this juncture).

The confidence bar you have to reach to commend the government for its actions and shift all blame to Anthropic in the mess should be very high. I don’t think that people already known for hating Anthropic have cleared it in their post-Fable criticisms of the company.

Put another way, would we have had Anthropic proceed more slowly in developing Mythos guardrails so it could release Fable? It was mere days ago that the Claude maker was in the barrel for being *too* conservative!

Still, one hell of an ad for Fable, yeah?

One final thought: The need for the world (ex-USA, ex-China) to access AI models of similar power puts France’s Mistral in pole position to capture market share outside the two leading AI hubs. Expect its funding round to close quickly, and for European (and world) governments to urge it forward in building Mythos-level capabilities for nations currently bereft.

Source: cautiousoptimism.news (original article)
