The first Anthropic-trained AI system to qualify as ASL-3 qualifies because of misuse risks rather than autonomy risks
Resolved YES (Dec 3)

This will be evaluated according to the AI Safety Levels (ASL) standard v1.0 defined by Anthropic here, which gives two different ways that an AI system could qualify for AI Safety Level 3 (ASL-3). This resolves based on the first clear public disclosure by Anthropic that indicates that they have trained a model and found it to qualify for ASL-3.

If Anthropic announces a policy that would prevent this information from being disclosed, announces that it has permanently ceased developing new AI systems, or ceases to operate, this will resolve N/A after six months.

  • Update 2025-05-01 (PST): Additional resolution criteria:

    • Anthropic must claim to have passed their CBRN threshold before passing their AI R&D or 2–8h software engineering thresholds, aligning with the criteria used for the previous autonomy risks category. (AI summary of creator comment)

  • Update 2025-11-24 (PST) (AI summary of creator comment): The creator plans to resolve this market based on Opus 4.5, which they believe qualifies for ASL-3 on the CBRN (misuse risks) threshold significantly more clearly than prior models. The system card states "As a result, we determined ASL-3 safeguards were appropriate" in the CBRN section. The creator sees no clear evidence that the autonomy or software engineering thresholds were passed. Resolution will occur in 24 hours unless objections are raised.


The system cards are fairly confusing around whether models have qualified, but on my reading, Opus 4.5 qualifies for ASL-3 for CBRN significantly more clearly than prior models, and I plan to count it. I don't see anything as clear about autonomy or software engineering. Comment in the next 24h if you want to argue for a different resolution.

In the CBRN section of the system card:
> As a result, we determined ASL-3 safeguards were appropriate.

It looks like the RSP has been restructured a bit, but the evals persist more or less as-is AFAICT. I'll resolve this YES if Anthropic claims to have passed their CBRN threshold before passing their AI R&D or 2–8h software engineering thresholds, which seem to match what was used for the old autonomy risks category.
