Will the first AI to get IMO gold tree-search the proof space?
27
116
แน€490
2030
51%
chance

This question resolves when an AI first gets IMO Gold. It resolves YES if this AI is explicitly coded to do the following:

  1. Receive as input (or construct using natural language processing), problems which take the form of mathematical propositions encoded in a formal language.

  2. Construct a tree of logical transformations of that initial proposition (e.g. in the form of tactic applications), and iteratively extend this tree with the goal of identifying a subtree which comprises a complete formal proof of the proposition.

Note that an AI which is not coded to do this explicitly, but instead identifies this strategy emergently does not count.

For example, an AI as described in the paper "HyperTree Proof Search for Neural Theorem Proving" would satisfy these conditions.

Close date updated to 2030-12-31 3:59 pm

Get แน€200 play money
Sort by:

This is not how the recent AlphaGeometry paper works, so updating towards no

@jonsimon It seems to me that an IMO bot that worked this way would definitely count as YES, per the criteria. Is there something I'm missing about AlphaGeometry that makes you think it doesn't count here?

To explicitly lay this out

Receive as input (or construct using natural language processing), problems which take take the form of mathematical propositions encoded in a formal language.

This seems pretty much like what is described in this figure:

Construct a tree of logical transformations of that initial proposition (e.g. in the form of tactic applications), and iteratively extend this tree with the goal of identifying a subtree which comprises a complete formal proof of the proposition.

It seems like this matches the "symbolic deduction through infinite branching points".

Does stuff like tree of thoughts count?

@acertain I believe tree-of-thoughts is informal, so no.

Seems likely, proofs require a lot of multi-step organization and planning. So far the approaches that have worked best for these have used RL-guided monte carlo tree searches (e.g. AlphaZero, AlphaTensor, etc).