
The close date will be extended until an AI model scores 80% or higher on either FrontierMath by EpochAI (https://epoch.ai/frontiermath) or Humanity's Last Exam by safe.ai (https://lastexam.ai/).
Resolution source for Humanity's Last Exam:
Resolution will use https://scale.com/leaderboard/humanitys_last_exam as its source, provided that leaderboard is still up to date at the end of 2025. Otherwise, I will use my discretion in determining whether a result should be considered valid. Results obtained through obvious cheating will not be considered valid.
Resolution source for FrontierMath:
Statements or information published by EpochAI on their website.
See also:
/Manifold/what-will-be-the-best-performance-o-nzPCsqZgPc
/Bayesian/what-will-be-the-best-ai-performanc
/Bayesian/will-o3s-score-on-the-last-exam-be
/MatthewBarnett/will-an-ai-achieve-85-performance-o--cash
/Bayesian/will-an-ai-achieve-85-performance-o-hyPtIE98qZ