Will OpenAI's next-gen math-focused model score at least 95% on the MATH benchmark?


2030

71% chance


Resolves YES if OpenAI's next-generation math-focused model scores 95% or higher on the MATH benchmark.

If the next generation of general models (e.g. GPT-4), code models (e.g. Codex), or any other model specialized for reasoning is released before the math model and scores 95% or higher, this question also resolves YES.

Benchmarking on a subset of MATH is acceptable.

Using tools (e.g. a calculator) and code is allowed.


## Related questions

By which years will AI be shown to have a better log loss than the Metaculus community pred. on <= 1 year predictions?

[OpenAI Startup Fund] Which ideas will OpenAI invest in as part of Converge Two?

Who will own the model at the top of the LMSYS Org Chatbot Arena Leaderboard at the end of March, 2024?

Will any open-source model rank higher than GPT-4 on ChatBot Arena in 2024? (according to ELO Rating)

85% chance