Is scale unnecessary for intelligence (<10B param human-competitive STEM model before 2027)?

1kṀ7299

2027

14% chance

Resolves YES if, before 2027, a neural net with <10B parameters achieves all of: >75% on GPQA, >80% on SWE-bench Verified, and >95% on MATH.

Arbitrary scaffolding allowed (retrieval over fixed DB is ok), no talking with other AI, no internet access. We'll allow up to 1 minute of time per question. We'll use whatever tools are available at the time to determine whether such an AI memorized the answers to these datasets; if verbatim memorization obviously happened, the model will be disqualified.
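As a rough illustration of how the criteria above combine, here is a minimal sketch of the resolution check. The `Candidate` class, its field names, and `resolves_yes` are hypothetical and exist only for this example. It assumes "before 2027" means strictly before 1 January 2027, that the thresholds are strict inequalities as written, and that the memorization judgment has already been made by some external process.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass
class Candidate:
    """Hypothetical record of one model's claimed results (illustration only)."""
    params_billions: float
    gpqa: float                  # percent correct on GPQA
    swe_bench_verified: float    # percent resolved on SWE-bench Verified
    math: float                  # percent correct on MATH
    seconds_per_question: float  # wall-clock time used per question
    achieved_on: date
    used_internet: bool = False
    talked_to_other_ai: bool = False
    memorized_verbatim: bool = False  # assumed to be judged externally


def resolves_yes(c: Candidate) -> bool:
    # Constraints on the system itself and on how it was run.
    within_rules = (
        c.params_billions < 10
        and c.seconds_per_question <= 60
        and c.achieved_on < date(2027, 1, 1)
        and not c.used_internet
        and not c.talked_to_other_ai
        and not c.memorized_verbatim
    )
    # All three benchmark thresholds must be exceeded at once.
    meets_scores = c.gpqa > 75 and c.swe_bench_verified > 80 and c.math > 95
    return within_rules and meets_scores


example = Candidate(
    params_billions=8,
    gpqa=76.0,
    swe_bench_verified=81.0,
    math=96.0,
    seconds_per_question=45,
    achieved_on=date(2026, 6, 1),
)
print(resolves_yes(example))                     # True
print(resolves_yes(replace(example, gpqa=75.0)))  # False: threshold is strict
```

Scaffolding and retrieval over a fixed database do not appear in the check because the description allows them without restriction.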

Technology

Technical AI Timelines

Science

AGI



Say hello to the future https://x.com/vitrupo/status/1930009915650912586

https://x.com/reach_vb/status/1881319500089634954

@JacobPfau Mismatch between title and description, 2027 vs 2030.

Also title says super-human, but based on the description, I doubt general consensus would even consider that AGI.

@SIMOROBO A system can be a multi-domain superintelligence without being AGI. I'd guess achieving the listed scores on these problems is a 1-in-1e6 or 1-in-1e7 feat for a human. Super-1:1e6-human is perhaps more precise, but I'll allow myself that much mis-specification in the title.

@JacobPfau I'm pretty sure you would have a hard time convincing anyone that a bipedal robot running a 100-meter sprint in exactly 10 seconds is "super-human", and yet that's probably a better than 1:1e7 result for a human.

I have not seen data on human performance on SWE-bench Verified, but I would assume it's very possible for humans to get 100%. Once AI makes it to 100%, other factors such as speed can begin to qualify it for the term super-human, in my opinion.