What if Anthropic unilaterally paused capabilities development right now?

Anthropic argues that a temporary pause in frontier AI development is necessary for safety, but claims a unilateral halt by one lab would be ineffective because less cautious competitors would simply take the lead. However, a unilateral pause by Anthropic could instead pressure other labs like OpenAI and Google DeepMind to follow suit, while signaling urgency to governments and potentially accelerating global regulation efforts.

In their new post on recursive self-improvement https://www.anthropic.com/institute/recursive-self-improvement , Anthropic argues that a pause in frontier AI development is needed, but unfortunately, they can't pause on their own, because of less cautious actors: We believe it would be good for the world to have the optionto slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology.... A meaningful slowdown or pause would require multiple well-resourced labs at or near the frontier, in multiple countries, agreeing to stop under the same conditions. It would also require that each can verify that the others have actually stopped. ... None of this is necessarily impossible in principle—the world has built verification regimes for other complex technologies e.g., the Intermediate-Range Nuclear Forces Treaty —but those regimes took decades to build both the infrastructure and the trust. We don’t have that long. A unilateral pause by one lab, by contrast, is achievable immediately, but accomplishes much less: it would change who the front-runner is, but it would not create the wider deliberative process that is currently missing. As many have pointed out, this reads a lot like lip service. But it sounds plausible: Anthropic seems to be the most safety-concerned lab right now, so the future would look worse if they weren't in the lead anymore because they paused unilaterally and a less cautious actor overtook them, right? I think this is fundamentally wrong, because it ignores many of the actual or possible effects of a unilateral pause. Mythos seems to have been a wake-up call for many, especially in governments around the world. For example, in response to Mythos, the president of the German Bundesamt für Sicherheit in der Informationstechnik, Claudia Plattner, called for a German AI Safety Institute https://www.bsi.bund.de/DE/Service-Navi/Presse/Alle-Meldungen-News/Blog/KI-Modelle neue Zeitrechnung 260508.html - something I have always thought was necessary, but wouldn't have deemed very likely before. It probably weren't the hacking capabilities of the new model alone that caused such a stir, but rather the fact that Anthropic chose to not publish it immediately and instead launched Project Glasswing. This could be seen as a clever PR stunt in the wake of the planned IPO, but I believe it was the correct thing to do and was mainly driven by real concerns. The decision to not publish a new model, thereby possibly giving up some revenue and market share, was a very strong signal that caused a lot of discussions and change in the political landscape. Now assume that Anthropic would unilaterally declare that they pause capabilities development, say, for three months, and instead put every resource they have into advancing AI safety for that time. They even offer options for outsiders to verify this. They publish a statement declaring that there is a significant risk now of accidentally creating an uncontrollable AI and they ask the other labs to pause development as well and join forces to improve AI safety techniques. This is of course a highly speculative scenario, but I think this would put enormous pressure on OpenAI and Google-Deepmind to follow their lead. After all, both Sam Altman and Demis Hassabis have said things like "if the others stop, we would stop too" in the past. It would be another wake-up call for politicians, making it very clear that the AI race is a real threat to humanity and regulation is urgently needed. Other labs, like Meta, X.AI and the Chinese, might be less inclined to follow suit. But I think the danger of them catching up signficantly in such a short time is low. The Chinese government has indicated in the past that they are willing to regulate AI development, so this could even open a window of opportunity for starting serious talks about global regulation. Would this move hurt Anthropic's IPO plans? Maybe, but I'm not sure. In the past, whenever they did something that seemed to hurt their revenue, like resisting the push by the Secretary of War to accept "any legal use" for Claude, it seems to have helped them more than hurt them. Anthropic is now seen as the "adult in the room", the most trustworthy and the most valuable AI lab. A unilateral pause may convince at least some investors that they are to be taken serious even more. Of course, given that they acknowledge How the alignment problem gets solved—or not—in this future is something we are least certain about. from a moral standpoint, a unilateral pause would be the only correct move in my opinion.