The first Anthropic-trained AI system to qualify as ASL-3 qualifies because of misuse risks rather than autonomy risks
Resolved YES (Dec 3)

This will be evaluated according to the AI Safety Levels (ASL) standard v1.0 defined by Anthropic here, which gives two different ways that an AI system could qualify for AI Safety Level 3 (ASL-3). This resolves based on the first clear public disclosure by Anthropic that indicates that they have trained a model and found it to qualify for ASL-3.

If Anthropic announces a policy that would prevent this information from being disclosed, announces that it has permanently ceased developing new AI systems, or ceases to operate, this will resolve N/A after six months.

  • Update 2025-05-01 (PST): Additional resolution criteria:

    • Anthropic must claim to have passed their CBRN threshold before passing their AI R&D or 2–8h software engineering thresholds, aligning with the criteria used for the previous autonomy risks category. (AI summary of creator comment)

  • Update 2025-11-24 (PST) (AI summary of creator comment): The creator plans to resolve this market based on Opus 4.5, which they believe qualifies for ASL-3 on the CBRN (misuse risks) threshold significantly more clearly than prior models. The system card states "As a result, we determined ASL-3 safeguards were appropriate" in the CBRN section. The creator sees no clear evidence that the autonomy or software engineering thresholds were passed. Resolution will occur in 24 hours unless objections are raised.


The system cards are fairly confusing around whether models have qualified, but on my reading, Opus 4.5 qualifies for ASL-3 for CBRN significantly more clearly than prior models, and I plan to count it. I don't see anything as clear about autonomy or software engineering. Comment in the next 24h if you want to argue for a different resolution.

In the CBRN section of the system card:
> As a result, we determined ASL-3 safeguards were appropriate.

It looks like the RSP has been restructured a bit, but the evals persist more or less as-is AFAICT. I'll resolve this YES if Anthropic claims to have passed their CBRN threshold before passing their AI R&D or 2–8h software engineering thresholds, which seem to match what was used for the old autonomy risks category.
