IMO 2025 is scheduled to take place between July 10 and July 20 this year. This market asks: will (at least some) members of the public be able, once the problems are posted online, to use an AI model to produce solutions that would win gold at the IMO?
"Publicly accessible" is meant here as a relatively lenient condition: it is not required that everyone, or even most people, be able to access the model. It just needs to be known about by the public, and there must be credible evidence that some people unaffiliated with the lab that trained the AI were able to run it and produce the solutions.
If possible, I plan to defer to the "Will an AI get gold on any International Math Olympiad by the end of 2025?" market for the resolution - if it resolves YES based on some AI that satisfies the above requirements, then this resolves YES in turn; likewise for NO. In the case of some more complicated situation, where it might not be clear if some solution "counts", I'll try to seek community consensus on whether it does.
Update 2025-02-04 (PST) (AI summary of creator comment): Publicly Accessible Clarification
Inclusions: External tests such as the o3-mini external safety testing (e.g. in mid-Jan) are intended to fall under the "publicly accessible" criterion.
Exclusions: Internal tools like AlphaProof (e.g. internal Google tests) are not considered publicly accessible.
Intent: The goal is to minimize the flow of detailed IMO information to the AI that is being used to generate solutions.
(Reposting a comment (about this question) from another post)
Hmm, the intent behind the lenient "publicly accessible" is to include things like the o3-mini external safety testing that happened in mid-Jan, but exclude things like AlphaProof (which AFAIK is just an internal Google thing?).
Sorry, I wasn't really clear enough. The intent/spirit of the market is to minimize the amount of information that could conceivably flow from the IMO to the model that gets tested on it. (Maybe I should have asked: if AI gets gold on IMO 2025, will its solutions be generated on July 16 and July 17? That seemed a bit less elegant to me, but now—)