BIG-bench accuracy 75% #2: Will SOTA for a single model on BIG-bench pass 75% by the start of 2025?


resolved Jan 9

Resolved

N/A


Benchmarks
Only the sub-benchmarks that are scored as an accuracy (i.e. from 0-100%) will be included (I think that's all of them, but I'm not sure).
It must be a single model. If Model A achieves 75% on half and Model B achieves 75% on the other half, that does not resolve the question YES.
Ensemble models are fine, but something like "run Model A on this benchmark and Model B on this other benchmark" is not. If there is model selection, it must be learned, and it cannot include the current benchmark as an input.
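
The single-model rule above can be sketched in code. This is only an illustration under stated assumptions: it assumes "pass 75%" means a mean accuracy strictly above 75% across the accuracy-scored sub-benchmarks, which the criteria do not spell out. The function names, model names, task names, and scores are all hypothetical.

```python
from statistics import mean

THRESHOLD = 75.0  # percent, taken from the market title


def best_single_model(scores, tasks):
    """Return (model, mean accuracy) for the best model that reports every task.

    `scores` maps model name -> {task name -> accuracy in percent}.
    Returns None if no model reports all tasks.
    """
    best = None
    for model, per_task in scores.items():
        # A model only counts if it reports every task itself; results from
        # different models are never stitched together.
        if not all(task in per_task for task in tasks):
            continue
        avg = mean(per_task[task] for task in tasks)
        if best is None or avg > best[1]:
            best = (model, avg)
    return best


def resolves_yes(scores, tasks, threshold=THRESHOLD):
    """ASSUMPTION: YES means one model's mean accuracy is strictly above threshold."""
    best = best_single_model(scores, tasks)
    return best is not None and best[1] > threshold


# Hypothetical data: each model is strong on a different half of the tasks.
tasks = ["task_1", "task_2"]
example_scores = {
    "model_a": {"task_1": 90.0, "task_2": 40.0},
    "model_b": {"task_1": 40.0, "task_2": 90.0},
}
print(resolves_yes(example_scores, tasks))
```

In this made-up example, picking the better model per task would give 90% on both tasks, but neither model passes the threshold on its own, so the question would not resolve YES under this reading.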

Update 2025-05-01 (PST) (AI summary of creator comment):
- If no BIG-bench results are available for any major models by the resolution date, the market will be resolved as N/A.
- NO will not be resolved based solely on SOTA results from 2023.
- YES will not be resolved based on personal predictions.


Technical AI Timelines


I'm inclined to resolve this N/A - I can't find BIG-bench results for any major models currently. I think it would be extremely disingenuous to resolve NO based on SOTA results from 2023, but I also won't resolve YES based on my personal guess that this could be done. Has anyone been able to find recent BIG-bench results?

For this and the related BIG-bench markets: it seems like most groups have stopped publishing metrics on the individual tasks (as opposed to average score) and are mostly publishing on BIG-Bench Hard instead. If that's the case, then my current plan is to resolve these markets N/A, and I'll make new ones asking about average score on BIG-Bench Hard.
