The close date will be extended until an AI model scores 80% or higher on either FrontierMath by Epoch AI (https://epoch.ai/frontiermath) or Humanity's Last Exam by the Center for AI Safety (safe.ai) and Scale AI (https://lastexam.ai/).
Resolution source for Humanity's Last Exam:
This market will use https://scale.com/leaderboard/humanitys_last_exam as its resolution source, provided it remains up to date at the end of 2025. Otherwise, I will use my discretion in determining whether a result should be considered valid. Results obtained through obvious cheating will not be considered valid.
Resolution source for FrontierMath:
Statements or information published by Epoch AI on their website.
See also:
/Manifold/what-will-be-the-best-performance-o-nzPCsqZgPc
@Bayesian I would agree with that, actually. I think both are set high enough that significant events within the Trump term and AI breaking into the public sphere both come first, which makes it very hard to have high confidence on this.