Anthropic Updates Responsible Scaling Policy—A Flexible Safety System for Growing AI Models
AI-processed from Anthropic Blog; edited by Hamidun News
Anthropic has published an update to its Responsible Scaling Policy—a policy for managing risks in developing powerful AI systems. This significant update introduces a more flexible and nuanced approach while maintaining the core principle: the company will not train or deploy models until it is confident that risks are at an acceptable level.
Why the Policy Was Updated
A year ago, Anthropic released the first version of the RSP to manage risks from increasingly capable AI systems. A year of practical implementation has shown the need for a more flexible approach. The company monitors not only the technical capabilities of its models but also the risks and consequences those capabilities could bring.
Anthropic monitors several categories of threats simultaneously. These include familiar problems such as the spread of misinformation, incitement to violence, and fraud, all of which are covered by the company's Usage Policy. The RSP, by contrast, focuses on catastrophic scenarios that could emerge when models reach certain levels of autonomy and capacity for complex manipulation.
The updated policy draws on practical experience and approaches used in other high-risk industries—aviation, nuclear energy, pharmaceuticals. This allows better preparation for the accelerating pace of AI development and the building of safety systems that scale with the technology.
How ASL Levels Work
The new system is based on the principle of proportional safety: safeguards should grow along with risks. Anthropic introduced AI Safety Level Standards (ASL Standards), graduated sets of technical and procedural requirements modeled loosely on the US government's biosafety level (BSL) standards, which govern how laboratories handle hazardous biological materials. The system starts with ASL-1 for models with basic capabilities (for example, specialized bots for chess or rapid information retrieval) and rises to ASL-2, ASL-3, and beyond as capabilities and potential risks grow. Each higher level imposes stricter requirements:
- Enhanced monitoring and logging of all operations performed by the model
- Stricter pre-deployment safety testing
- Additional layers of access control, isolation, and segmentation
- Mandatory independent audits and checks by external safety experts
- More frequent reassessment of potential risks as new data emerges
Currently, all Anthropic models operate under the ASL-2 standard, which the company considers to reflect industry best practices today.
Capability Thresholds—When Greater Readiness Is Needed
Instead of vague and subjective criteria, Anthropic has defined concrete Capability Thresholds: specific model abilities that, once reached, trigger stronger safeguards and a higher ASL standard. So far, two key thresholds have been identified. The first is Autonomous AI R&D: if a model can independently conduct complex AI research tasks that typically require human expertise, it could significantly accelerate AI development in unpredictable ways. The second concerns chemical, biological, radiological, and nuclear (CBRN) weapons: if a model can meaningfully assist someone with a basic technical background in creating or deploying such weapons, enhanced security and deployment safeguards are required. The company leaves open the possibility of expanding this list as it better understands the real-world impact of new capabilities.
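To make the threshold logic concrete, here is a minimal Python sketch of how crossing a capability threshold could raise the required safety standard. It is an illustration only, not Anthropic's implementation. The class and function names are hypothetical, the specific level attached to each threshold is an assumption made for the example, and the second registry entry is a generic placeholder.

```python
from dataclasses import dataclass
from enum import IntEnum


class ASL(IntEnum):
    """Graduated safety levels; a higher value means stricter requirements."""
    ASL_1 = 1
    ASL_2 = 2
    ASL_3 = 3
    ASL_4 = 4


@dataclass(frozen=True)
class CapabilityThreshold:
    name: str
    required_standard: ASL  # illustrative assumption, not taken from the policy


# Hypothetical registry: names and levels are placeholders for illustration.
THRESHOLDS = [
    CapabilityThreshold("autonomous_ai_rnd", ASL.ASL_4),
    CapabilityThreshold("placeholder_threshold", ASL.ASL_3),
]


def required_standard(crossed: set[str], baseline: ASL = ASL.ASL_2) -> ASL:
    """Return the strictest standard implied by the thresholds a model has crossed."""
    levels = [t.required_standard for t in THRESHOLDS if t.name in crossed]
    return max([baseline, *levels])


def may_proceed(crossed: set[str], implemented: ASL) -> bool:
    """Allow training or deployment only if safeguards meet the required standard."""
    return implemented >= required_standard(crossed)
```

With no thresholds crossed, the ASL-2 baseline described in the article applies. Crossing any threshold means the model cannot proceed until safeguards at the stricter level are in place, which is the "proportional safety" idea expressed as a simple comparison.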
What This Means
Anthropic's approach suggests that AI safety need not mean either a blanket halt on development or an absence of oversight. Instead, the company is building a scalable system that grows with the technology and adapts to real-world risks. The approach matters for other developers as well: if Anthropic's ideas gain broad acceptance, they could become a de facto industry standard. That is especially relevant for regulators, who are now seeking practical frameworks for overseeing AI systems.