Will SOTA on any major code benchmark go up at least twice this year?
Resolved YES (Jan 1)
Major code benchmarks include:
- Performance on any major code competition (IOI, ICPC, the various competition websites)
A single benchmark needs to go up twice. So a single model that improves SOTA on both HumanEval and APPS would not resolve the market YES; we need two separate SOTA improvements on the same benchmark (i.e., two different models that each set a new SOTA there).
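The resolution rule above can be sketched as a small check over a chronological log of reported SOTA scores. The benchmark names and numbers below are purely illustrative, not real results:

```python
# Hypothetical sketch of the resolution rule: the market resolves YES
# only if some single benchmark's SOTA improved at least twice.
def resolves_yes(events):
    """events: list of (benchmark, score) pairs in chronological order."""
    best = {}          # benchmark -> current SOTA score
    improvements = {}  # benchmark -> count of SOTA improvements this year
    for bench, score in events:
        if bench not in best:
            best[bench] = score  # first reported score sets the baseline
            continue
        if score > best[bench]:
            best[bench] = score
            improvements[bench] = improvements.get(bench, 0) + 1
    # YES iff any one benchmark improved at least twice
    return any(n >= 2 for n in improvements.values())

events = [
    ("HumanEval", 65.0),  # baseline SOTA at start of year
    ("APPS", 22.0),       # baseline
    ("HumanEval", 67.0),  # first improvement on HumanEval
    ("APPS", 24.0),       # improvement, but on a different benchmark
    ("HumanEval", 71.0),  # second improvement on HumanEval -> YES
]
print(resolves_yes(events))  # True
```

Note that one model improving two different benchmarks once each leaves every per-benchmark count at 1, so the check correctly returns False in that case.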
Technical AI Timelines questions
By the end of 2026, will we have transparency into any useful internal pattern within a Large Language Model whose semantics would have been unfamiliar to AI and cognitive science in 2006?
53% chance
By end of 2028, will there be a global AI organization, responsible for AI safety and regulations?
42% chance
Related questions
Short-term AI 3.4: By June 2024 will SOTA on APPS be >= 25%?
25% chance
Will SOTA on MATH in Sep 2024 utilize a hard-coded search/amplification procedure?
56% chance
Will there be a period of 12 contiguous months during which no new compute-SOTA LM is released, by Jan 1, 2033?
70% chance
Short Term AI 3.2: By June 2024 will SOTA on MATH be >= 90%?
14% chance
BIG-bench accuracy 75% #2: Will SOTA for a single model on BIG-bench pass 75% by the start of 2025?
60% chance
SOTA on a SWE-bench [Unassisted] in October 2024
SOTA on a SWE-bench [Assisted] in October 2024
Will self-improving AI agents crush SOTA in a complex environment (e.g. AAA game, tool use, science) in next 12 months?
41% chance
By 2026, will it be standard practice to sandbox SOTA LLMs?
26% chance
MMLU 99% #3: Will SOTA for MMLU (average) pass 99% by the start of 2026?
16% chance