Will any LLM released by EOY 2025 be dangerously ASL-3 as defined by Anthropic?


resolved May 23

Resolved YES


Given the 2024 market saw interest but opened up very low, I figured I'd open up 2025 as well.

As per Anthropic's scaling policy here: https://www.anthropic.com/index/anthropics-responsible-scaling-policy

ASL-3 is defined as: "ASL-3 refers to systems that substantially increase the risk of catastrophic misuse compared to non-AI baselines (e.g. search engines or textbooks) OR that show low-level autonomous capabilities."

Anthropic commits to not deploying ASL-3 models 'if they show any meaningful catastrophic misuse risk under adversarial testing by world-class red-teamers (this is in contrast to merely a commitment to perform red-teaming).'

Resolves to YES if in my best judgment this happened. I will put large weight on Anthropic's statements on this question, and on general consensus including polls, but will go my own way if I feel sufficiently strongly about it.

Resolves to NO if in my best judgment this does not happen.

(Resolves to a percentage if there is genuine uncertainty but the bar for doing this is high and I find this highly unlikely.)

If a model is created but not released to at least a substantial outside beta testing group by the deadline, it does not count.

I interpret for now 'low-level autonomous capabilities' as something that would tempt reasonable people to give the model real-world actual-stakes autonomous tasks for mundane utility purposes, with the expectation this was economically wise, or the ability to otherwise make money on its own, or similar. If Anthropic clarifies I will use their definition.

No currently released system counts, including GPT-4, Claude-2 and Llama-2, barring very unexpected advancements in autonomous capability scaffolding on top of them, though in theory that could also do it.

I reserve the right to modify the resolution details for clarity and intent. This market will retain the same criteria as the 2024 version (and if that resolves YES, so does this).

EOY 2024 Version of this market is here: https://manifold.markets/ZviMowshowitz/will-any-llm-released-by-eoy-2024-b

Will any LLM released by EOY 2024 be dangerously ASL-3 as defined by Anthropic?

13% chance





bought Ṁ2,750 YES

@ZviMowshowitz resolves YES

https://x.com/JasonDClinton/status/1925604585928106351

bought Ṁ50 NO

@HenriThunberg I think this doesn't resolve YES yet. Per Anthropic:

"We are deploying Claude Opus 4 with our ASL-3 measures as a precautionary and provisional action. To be clear, we have not yet determined whether Claude Opus 4 has definitively passed the Capabilities Threshold that requires ASL-3 protections. Rather, due to continued improvements in CBRN-related knowledge and capabilities, we have determined that clearly ruling out ASL-3 risks is not possible for Claude Opus 4 in the way it was for every previous model, and more detailed study is required to conclusively assess the model’s level of risk."

My read of this market is "will a model released by Anthropic have ASL-3 level capabilities", not "will a model released by Anthropic be subject to ASL-3 level safety measures".

@ZviMowshowitz I'd be curious to hear you address this comment -- as far as I know, we don't have enough evidence yet to resolve this YES.

@EricNeyman I think if they have to deploy the ASL-3 precautions, that makes it ASL-3. That's how actual danger works!

what does this market mean? that an ASL-3 model is released, or an ASL-3 model is released that would be prohibited by Anthropic's deployment criteria?