Will any AI model achieve > 40% on Frontier Math before 2026?


Resolved YES on Dec 13


The model need not be released

Update 2025-09-19 (PST) (AI summary of creator comment): Resolution will be based on Epoch's reported Frontier Math scores. Other sources (e.g., AI Digest or lab-only reports) will not determine resolution.




sold Ṁ207 NO

Not verified or posted on the official page yet, but GPT-5.2 high is showing 40.3 here: https://epoch.ai/benchmarks/use-this-data

@TimDuffy Yeah, I see it on the official dashboard now.

bought Ṁ2,839 YES

@JaundicedBaboon Resolves YES.

bought Ṁ500 NO

https://epoch.ai/frontiermath Epoch just posted evals, and 5.2 only got 26.6%. I will leave this unresolved for now in case that was the non-thinking version or the results are amended. It seems shockingly low.


bought Ṁ150 NO

@JaundicedBaboon I'd wait to resolve, since there's some small chance Epoch will evaluate Gemini 3 Deep Think. They haven't yet, and I bet it would exceed 40 if they did. I'm also surprised at the low score!

bought Ṁ100 NO

The 26% is for 5.2 low; high could be much higher, actually!

sold Ṁ106 NO

5.1 scored: 17.3% low, 26.9% med, 31.0% high.
If 5.2 has the same low/high gap, it will be right at 40.
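The extrapolation in the comment above can be sketched as a short calculation. This is only an illustration of the arithmetic, assuming (as the thread suggests) that the 26.6% figure is GPT-5.2 at the low setting and that the low-to-high gap carries over unchanged from 5.1; the function name `extrapolate_high` is hypothetical.

```python
# Scores quoted in the thread (percent on FrontierMath).
gpt_5_1 = {"low": 17.3, "med": 26.9, "high": 31.0}
gpt_5_2_low = 26.6  # assumption: this is the "low" reasoning setting


def extrapolate_high(prev_scores: dict, new_low: float) -> float:
    """Add the previous model's low-to-high gap to the new model's low score."""
    gap = prev_scores["high"] - prev_scores["low"]
    return round(new_low + gap, 1)


estimate = extrapolate_high(gpt_5_1, gpt_5_2_low)
print(f"Estimated 5.2 high score: {estimate}%")
print("Above 40% threshold:", estimate > 40.0)
```

The estimate sits only a fraction of a point above the threshold, so a small change in the gap would flip the outcome.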

@TimDuffy Plus, they will test it at extra high, not high, since that's new for 5.2-thinking.

bought Ṁ25 NO

This will likely resolve YES, but note that this market is based on Epoch's evaluation. I think the 40.3 we've seen is OpenAI's.

bought Ṁ100 NO

Previously, OpenAI evaluated o3 and reported 25.3; Epoch evaluated it and reported 18.7.

bought Ṁ75 NO

IIRC Epoch hasn't evaluated Gemini 3 Deep Think, though. If they do before EOY, I think that model is likely to exceed 40%.

bought Ṁ949 YES

Well fuck me 😅

https://epoch.ai/blog/deep-think-math

bought Ṁ200 YES

Epoch reported long ago that Agent 1 scored 49% on the original FrontierMath (now Tiers 1-3) with pass@16.

https://x.com/EpochAIResearch/status/1945905802998423867

Does this count?

@qumeric Pass@16 should definitely not count... If it did, why not pass@32 or pass@64? It's clear that this market is about pass@1.
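To see why pass@16 and pass@1 are not comparable, here is a minimal sketch of the commonly used unbiased pass@k estimator (the probability that at least one of k samples, drawn from n attempts with c correct, solves the problem). The thread does not describe how Epoch computes its figures, so this is an illustration of the metric in general rather than their procedure.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k samples drawn without replacement
    from n attempts (c of them correct) is correct."""
    if n - c < k:
        # Fewer than k incorrect attempts exist, so any k samples include a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical problem that a model solves in 1 of 16 attempts.
print(pass_at_k(n=16, c=1, k=1))   # chance a single attempt is correct
print(pass_at_k(n=16, c=1, k=16))  # chance at least one of all 16 is correct
```

A problem solved only once in 16 tries counts as fully solved under pass@16 but contributes very little under pass@1, which is why the score keeps rising as k grows.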

Why is this so different from this market? Are both based on FrontierMath Tiers 1-3? https://manifold.markets/SG/top-frontiermath-score-in-2025

Resolution will be based on Epoch's reported Frontier Math scores.

Historically, OpenAI reported 32% for o3-mini with Python (which counts for the purpose of that other market, afaict), but Epoch, testing it with the general / minimal scaffold, got 11.03%. That likely isn't because OpenAI is making up numbers or whatever, but they demonstrably have a different setup.

@JaundicedBaboon does this resolve according to AI Digest (which includes e.g. lab-reported scores) or according to Epoch’s evaluation?

@bh I’ll go by what Epoch reports

opened a Ṁ500 NO at 45% order

@Bayesian Limit up at 45% ;)

@BrunoJ i can uh... get a better price if i wait... 😭

opened a Ṁ3,000 YES at 51% order

All it would take is running the IMO model on Frontier Math.

bought Ṁ500 NO

@VinceVatter FrontierMath is orders of magnitude harder than IMO.

@traders 116 days until 2026! Is a breakthrough expected over the next 4 months? Given the size of the jump from GPT-4 to GPT-5, I'm not sure why this is at 55%. I'm going to keep buying a little bit more NO every day.

bought Ṁ250 NO