What will be o3's score on FrontierMath?

resolved May 14

Answer	Final probability
Less than 30%	98.6% (resolved 100%)
30% - 35%	0.8%
35% - 40%	0.2%
40% - 45%	0.2%
45% - 50%	0.2%
At least 50%	0.1%

OpenAI has announced a model named o3. What will be the score of this model on FrontierMath?

Resolution is based on the score OpenAI publicly claims for o3 after its release. If there are multiple scores (e.g. for various levels of inference-time compute), the highest one will be used. Tool usage, including running Python and accessing the web, is allowed.

If OpenAI makes no claims about o3's score within two weeks of release, I'll use my best judgment.

I will trade on this market.

Note: There have been prior claims that o3 achieved a score of 25.2% on FrontierMath. However, this market concerns claims made in connection with the public deployment of (a possibly further refined version of) o3; it's plausible that the scores reported at deployment will be much higher, which is why a market on this is of interest. The prior 25.2% claim is irrelevant for the resolution of this market.

Note: EpochAI has a holdout subset of the FrontierMath benchmark. This is not within the scope of this market. That is, if both OpenAI and EpochAI announce scores for o3, I will resolve based on the OpenAI score.

For reference, if this market had been about o3-mini rather than o3, it would have resolved to "30% - 35%", based on the 32% score reported in OpenAI's blog post.
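
The resolution rule above (take the highest score OpenAI publicly claims, then pick the matching bucket) can be sketched in a few lines of Python. This is only an illustration, not the market's actual mechanism: the names `BUCKETS` and `resolve_bucket` are hypothetical, and the treatment of boundary values (a score of exactly 35% falling into "35% - 40%") is an assumption inferred from the "Less than 30%" and "At least 50%" labels.

```python
from typing import Iterable

# (lower bound in percent, bucket label), checked from highest to lowest.
# Assumption: each bucket includes its lower bound and excludes its upper
# bound; the market description does not spell out boundary handling.
BUCKETS = [
    (50.0, "At least 50%"),
    (45.0, "45% - 50%"),
    (40.0, "40% - 45%"),
    (35.0, "35% - 40%"),
    (30.0, "30% - 35%"),
]


def resolve_bucket(claimed_scores: Iterable[float]) -> str:
    """Return the bucket label for the highest claimed score (in percent)."""
    scores = list(claimed_scores)
    if not scores:
        # With no claimed score, the description says the creator
        # falls back to their best judgment, which code cannot model.
        raise ValueError("no claimed scores to resolve on")
    best = max(scores)  # multiple scores: the highest one is used
    for lower, label in BUCKETS:
        if best >= lower:
            return label
    return "Less than 30%"
```

With the o3-mini figure mentioned above, `resolve_bucket([32.0])` returns `"30% - 35%"`; if several scores were reported for different compute levels, for example `resolve_bucket([25.2, 41.0])`, only the highest would count and the result would be `"40% - 45%"`.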

🏅 Top traders

#	Trader	Total profit
1		Ṁ1,161
2		Ṁ1,131
3		Ṁ272
4		Ṁ132
5		Ṁ53

bought Ṁ15 YES

people are way too confident about bucket 1

@Loppukilpailija OpenAI has released a model card but opted not to use FrontierMath. https://openai.com/index/introducing-o3-and-o4-mini/

Epoch's evals show it to be at 10%.

@MingCat Thanks. I will wait until two weeks after release in case OpenAI publishes FrontierMath results.

It would be unfortunate to have to rely on the EpochAI results, since their scores are substantially lower than those claimed by OpenAI, so the comparison is not apples-to-apples. But if no further information comes out, I suppose it's fair to assume that o3 hasn't substantially improved. Less than 30% seems fair in this case.