[ARTICLE · art-22528] src=sherwood.news pub= topic=artificial-intelligence verified=true sentiment=· neutral

Anthropic ponders self-improving AI

Anthropic reported that its AI model Claude now writes 80% of the company's internal code, raising questions about the potential for recursive self-improvement where AI systems autonomously enhance their own capabilities. The company's blog post examines whether such self-improving AI could lead to loss of human control, citing risks similar to those depicted in fictional scenarios like Skynet, while acknowledging that human guidance remains essential for designing critical experiments.

5 min read · published Jun 5, 2026

Anthropic says Claude already writes 80% of its code. A new post asks what happens when the models can improve themselves — and whether anyone could stop them.

As AI models rapidly improve at writing code, the role of humans in software development is shifting to mere oversight and direction. Anthropic says that as of May 2026, Claude has written about 80% of its internal code.

But what happens when the AI models don’t need humans any more, and the models can write the code to improve themselves autonomously?

This concept, known as **recursive self-improvement**, is currently getting a lot of attention in the AI industry. The risk of losing control of an AI system as it exponentially improves itself to the detriment (and attempted extermination) of humans is what played out with The Terminator’s “Skynet.”

Anthropic ponders this concept in a long blog post authored by Marina Favaro and Jack Clark of the Anthropic Institute, which checks in on how close the company’s models might be to something that looks like recursive self-improvement, and how this could play out.

Models are rapidly improving

The authors write that across the industry, consistently high scores from models are “saturating” key coding benchmarks like SWE-bench. The models are now able to do bigger, more complex tasks: Claude Opus went from handling 4-minute software tasks in 2024 to tackling 12-hour tasks in 2026.

Anthropic engineers are experiencing this dramatic shift in their work, according to a developer quoted in the post:

“I started leaning hard into Claudifying about a year ago. That’s been a crazy adventure and it’s now been ~5 months since I last wrote any code myself.”

The post includes a compelling chart showing a steady rise in lines of Claude-created internal code starting last year, followed by a steep jump with the arrival of Mythos. Not only is the code written mostly by AI, its quality is expected to surpass that of human developers this year.

Humans have better “research taste”

The post cites several key areas where Claude has excelled: it is very good at finding bugs in older code, it can be used to quickly diagnose and fix live system failures, and it can set up iterative code re-writing loops that are currently able to speed up software around 52x on average (using Mythos).

In one example cited in the post, Claude made 800 fixes to an API, drastically reducing errors. That work would have taken a human engineer an estimated four years, and it is the kind of work that probably would not have been done in the first place, the authors note.

But humans appear to still have the edge in designing the crucial AI tests and experiments that help move AI forward. Humans have better “research taste,” though Claude is getting better at this, the post notes.

Existential questions

Some of Anthropic’s developers seem to be grappling with some existential issues related to their work. One employee was quoted as saying:

“On days where everything works well, I can’t help but think nothing I do matters, everything is automated and better and faster than I ever will be. But then there are days where everything breaks and I don't understand why and I realize I have no idea what I’ve been up to anymore.”

Maybe it can’t happen

The authors admit frankly that such self-improving systems might not even be possible. Human guidance has led to all of the breakthroughs to date, thanks to all those clever experiments we designed. Maybe AI is just a very useful tool for speeding up repetitive testing of the ideas we have — scale, fix, repeat.

So are we headed towards a world of self-improving AI models that we can’t keep an eye on? Anthropic is basically saying, we don’t really know. Super-advanced AI systems could cure disease and power helpful robots, but they could also lead to unforeseen negative consequences.

The authors lay out three possible scenarios for how they think this could play out:

**1. Things could plateau:** Supply chain constraints for data centers, chips or electricity could preclude the next big leap in computing. Or maybe the crazy, consistent scaling we have seen just stops working.

**2. Continued gains going forward:** The most likely scenario described by the authors predicts that the work will essentially continue at pace, seeing “compounding efficiency gains.” But as code writing speeds up, human code review would still be a major bottleneck.

**3. AI starts to build (and improve) itself:** With humans largely out of the loop, the only constraint will be physical infrastructure and energy. Self-improving AI systems might decide to halt AI development, but they also could become “misaligned” with human safety:

“The rare occurrences of misalignment present in today’s models could compound as the models build their successors, growing more frequent but less understood until we lose control of them.”

Slow it down?

As to what the industry should do at this moment as it hurtles into uncertainty, the post offers some ways forward.

The authors consider the growing call to simply slow down AI development, to make sure the technology is used for good:

“If it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing.”

This would require a kind of global coordination that seems increasingly unlikely given today’s geopolitical problems. But even if we could all agree on what a slowdown might look like, bad actors could use that time to level up their attacks, the authors argue.

A verification regime like a nuclear weapons treaty could serve as a model for international cooperation to regulate responsible development of self-improving systems, but AI moves much faster than the pace of decades-long diplomacy. As the authors note: “We don’t have that long.”
