Will any LLM released by EOY 2024 be dangerously ASL-3 as defined by Anthropic?
58
492
1k
2025
25%
chance

As per Anthropic's scaling policy here: https://www.anthropic.com/index/anthropics-responsible-scaling-policy

ASL-3 is defined as: "ASL-3 refers to systems that substantially increase the risk of catastrophic misuse compared to non-AI baselines (e.g. search engines or textbooks) OR that show low-level autonomous capabilities."

Anthropic commits to not deploying ASL-3 models 'if they show any meaningful catastrophic misuse risk under adversarial testing by world-class red-teamers (this is in contrast to merely a commitment to perform red-teaming).'

Resolves to YES if in my best judgment this happened. I will put large weight on Anthropic's statements on this question, and on general consensus including polls, but will go my own way if I feel sufficiently strongly about it.

Resolves to NO if in my best judgment this does not happen.

(Resolves to a percentage if there is genuine uncertainty but the bar for doing this is high and I find this highly unlikely.)

If a model is created but not released to at least a substantial outside beta testing group by the deadline, it does not count.

I interpret for now 'low-level autonomous capabilities' as something that would tempt reasonable people to give the model real-world actual-stakes autonomous tasks for mundane utility purposes, with the expectation this was economically wise, or the ability to otherwise make money on its own, or similar. If Anthropic clarifies I will use their definition.

No currently released system currently counts, including GPT-4, Claude-2 and Llama-2, barring very unexpected advancements in autonomous capability scaffolding on top of them, but in theory that could also do it.

I reserve the right to modify the resolution details for clarity and intent.

Get Ṁ600 play money
Sort by:

Jason Clinton (CISO from Anthropic) on Hacker News:

ASL-3 seems somewhat likely within the next 6-9 months. Maybe 50% odds, by my guess.

filled a Ṁ15 YES at 35% order

"It's really good, like materially better," said one CEO who recently saw a version of GPT-5. OpenAI demonstrated the new model with use cases and data unique to his company, the CEO said. He said the company also alluded to other as-yet-unreleased capabilities of the model, including the ability to call AI agents being developed by OpenAI to perform tasks autonomously.

Supposing this is true, I think it is likely to count as ASL-3: