Will any LLM released by EOY 2025 be dangerously ASL-3 as defined by Anthropic?
Resolved YES (May 23)

Given that the 2024 market saw interest but opened very low, I figured I'd open a 2025 version as well.

As per Anthropic's Responsible Scaling Policy here: https://www.anthropic.com/index/anthropics-responsible-scaling-policy

ASL-3 is defined as: "ASL-3 refers to systems that substantially increase the risk of catastrophic misuse compared to non-AI baselines (e.g. search engines or textbooks) OR that show low-level autonomous capabilities."

Anthropic commits to not deploying ASL-3 models 'if they show any meaningful catastrophic misuse risk under adversarial testing by world-class red-teamers (this is in contrast to merely a commitment to perform red-teaming).'

Resolves to YES if in my best judgment this happened. I will put large weight on Anthropic's statements on this question, and on general consensus including polls, but will go my own way if I feel sufficiently strongly about it.

Resolves to NO if in my best judgment this does not happen.

(Resolves to a percentage if there is genuine uncertainty but the bar for doing this is high and I find this highly unlikely.)

If a model is created but not released to at least a substantial outside beta testing group by the deadline, it does not count.

For now, I interpret 'low-level autonomous capabilities' as capabilities that would tempt reasonable people to give the model real-world, actual-stakes autonomous tasks for mundane utility purposes, with the expectation that this was economically wise, or the ability to otherwise make money on its own, or similar. If Anthropic clarifies, I will use their definition.

No currently released system counts, including GPT-4, Claude-2, and Llama-2, barring very unexpected advancements in autonomous capability scaffolding on top of them, though in theory that could also do it.

I reserve the right to modify the resolution details for clarity and intent. This market retains the same criteria as the 2024 version (and if that one resolves YES, so does this).

EOY 2024 Version of this market is here: https://manifold.markets/ZviMowshowitz/will-any-llm-released-by-eoy-2024-b

bought Ṁ2,750 YES

@ZviMowshowitz resolves YES

bought Ṁ50 NO

@HenriThunberg I think this doesn't resolve YES yet. Per Anthropic:

We are deploying Claude Opus 4 with our ASL-3 measures as a precautionary and provisional action. To be clear, we have not yet determined whether Claude Opus 4 has definitively passed the Capabilities Threshold that requires ASL-3 protections. Rather, due to continued improvements in CBRN-related knowledge and capabilities, we have determined that clearly ruling out ASL-3 risks is not possible for Claude Opus 4 in the way it was for every previous model, and more detailed study is required to conclusively assess the model’s level of risk.

My read of this market is "will a model released by Anthropic have ASL-3 level capabilities", not "will a model released by Anthropic be subject to ASL-3 level safety measures".

@ZviMowshowitz I'd be curious to hear you address this comment -- as far as I know, we don't have enough evidence yet to resolve this YES.

@EricNeyman I think if they have to deploy the ASL-3 precautions, that makes it ASL-3; that's how actual danger works!

What does this market mean? That an ASL-3 model is released, or that an ASL-3 model is released that would be prohibited by Anthropic's deployment criteria?

Release of an ASL-3 model, whether or not it would be prohibited.

predicted YES

If a model is released at the end of the year, will you wait for Anthropic to judge it?

@Mira Yes, if the answer isn't obvious.
