This market resolves as soon as OpenAI's o1-pro is tested on FrontierMath by a reputable source. Currently, no model achieves even 2%, as noted on page 9 of https://arxiv.org/pdf/2411.04872
I will not bet in this market.
An introduction to FrontierMath can be found here: https://epoch.ai/frontiermath
Update 2025-01-16 (PST) (AI summary of creator comment):
- Closing Time: Extended until the release of o3.
- Resolution Criteria:
  - If o1-pro is tested on FrontierMath by a reputable source before the release of o3, the market resolves accordingly.
  - If no benchmark results are available by the release of o3, the market resolves as N/A.
OpenAI claims 5.8% pass@1 for o1: https://openai.com/index/openai-o3-mini/
Those pass@4 figures imply this might be 10% or so.
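As a rough sanity check on how a pass@1 number relates to pass@4, here is a minimal sketch that brackets pass@k given only the average pass@1. It assumes samples on each problem are independent, and it uses only the 5.8% figure quoted above, not the pass@4 numbers from the linked page. The function name `pass_at_k_bounds` is made up for illustration. The lower bound is pass@1 itself (every extra sample fails on the problems the first one missed), and the upper bound is the case where every problem is equally hard.

```python
def pass_at_k_bounds(pass_at_1: float, k: int) -> tuple[float, float]:
    """Bracket pass@k from an average pass@1, assuming independent samples.

    Lower bound: extra samples never help, so pass@k == pass@1.
    Upper bound: every problem has the same success rate, so
    pass@k == 1 - (1 - pass@1)**k. Uneven difficulty across problems
    can only pull the true value below this (Jensen's inequality).
    """
    if not 0.0 <= pass_at_1 <= 1.0:
        raise ValueError("pass_at_1 must be a probability in [0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    lower = pass_at_1
    upper = 1.0 - (1.0 - pass_at_1) ** k
    return lower, upper


# 5.8% is the o1 pass@1 figure quoted in the comment above.
low, high = pass_at_k_bounds(0.058, 4)
print(f"pass@4 is somewhere between {low:.1%} and {high:.1%}")
```

Under these assumptions a guess of around 10% falls inside the bracket, but the bracket is wide, so this says nothing precise about what o1-pro would actually score.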
@Usaar33 That's not precise enough for me to resolve. Unless something changes in the next few hours, I plan to resolve the market as N/A.