WebProNews

Anthropic's Quiet Retreat from a Core Safety Vow Leaves AI Watchdogs Wary

In a move sending ripples through the tech world, Anthropic has revised a foundational safety pledge, softening a key commitment that once defined its brand as the conscientious builder in artificial intelligence. The company, founded by ex-OpenAI researchers, had promised to halt deployment of any AI model showing signs it could enable catastrophic harm. That hard line is now blurred.

The change centers on Anthropic’s Responsible Scaling Policy (RSP), first introduced in 2023. The original framework included specific, high-risk capability thresholds—related to bioweapons, cyberattacks, or self-replication—that would trigger an automatic pause in deployment. The updated policy, reported by Time and published with little fanfare, swaps those firm obligations for flexible, principle-based guidance. Where it once said the company “will not” deploy without safeguards, it now employs language that grants executives significant discretion.

Anthropic defends the rewrite, stating early inexperience with AI development necessitated a more adaptable framework. CEO Dario Amodei has previously articulated the tension between safety and the fierce competition to lead the field, suggesting that losing a technological edge to less cautious rivals could itself be a risk.

That competitive pressure is palpable. With rivals like OpenAI, Google DeepMind, and Meta advancing rapidly, and with Anthropic itself backed by over $7 billion from investors including Amazon and Google, the stakes for staying at the frontier are immense. A voluntary pause could mean losing top talent and market position.

The revision has drawn sharp criticism from safety researchers who championed Anthropic’s original policy as an industry model. Their concern is twofold: it weakens a leading example of corporate self-governance and arrives as U.S. AI regulation remains stalled. California’s proposed safety bill was vetoed last year after industry pushback, leaving voluntary measures as a primary guardrail.

This follows a pattern. OpenAI, started as a nonprofit, has restructured around commercial aims. As AI firms scale, their pioneering safety vows often face erosion against market forces. Anthropic’s shift, while not abandoning safety work, changes a transparent rule into an opaque judgment call. It asks for trust that the company will choose safety over commercial pressure when it matters most—a test the entire industry is now watching.