What will be true about AI and MIT Mystery Hunt 2026?
Feb 19
69%: A team announces that AI solves any puzzle
43%: A team with a Google model has the “best” results
43%: A team announces that an AI independently solved at least 10 hunt puzzles regardless of order and standard puzzle unlock rules
42%: A team using an OpenAI model has the “best” results
37%: A team announces that an AI independently solved at least 1 entire hunt round (metapuzzle and any feeder puzzles whose answers were used to solve the meta). Does not have to be an opening round
35%: A team announces that an AI solved at least 1 hunt metapuzzle (regardless of whether it solved any earlier puzzles or associated feeders)
35%: A team announces that an AI independently solved at least 5 hunt puzzles *following* standard hunt unlock rules (can only skip puzzles or do free unlocks if other teams given the same option). A free unlock is not a solve
33%: At least three groups or teams publish writeups of testing/training/benchmarking/etc. AI on the hunt
29%: A team announces that an AI independently solved at least 20 hunt puzzles, regardless of order and standard puzzle unlock rules

I won’t bet.

All criteria must be met by market close to count.

I realize “best” is subjective. If it’s sufficiently unclear to me I may N/A.

I’m allowing humans to do some work with inputting puzzles, clicking on the website, taking screenshots, etc., but for a group’s writeup to count, the AI should be doing the vast majority of the work on any puzzles it is claimed to have solved.

Clarification questions welcome.

I’ll try to be generous in accepting what teams report.

I may N/A some markets if it is unclear whether they have been met or if I realize belatedly the criteria are too poorly defined.


@JimHays I don’t have any special knowledge, but I’m surprised this is so low. But maybe some groups won’t publish writeups or they will take too long to come out?

@JimHays Maybe I misunderstood what testing/training/benchmarking meant.

@Eliza I should have used the exact language from the registration form, but this is what I was intending to refer to:

It would likely be a lot more work to do a thorough attempt on this compared to the Putnam or IMO, and maybe results won’t be impressive enough for official teams to announce results, so whatever we get might just be from hobbyists?

@JimHays This is back-channel conversation that maybe I'm not supposed to repeat, but as of mid-December zero teams that had registered have said that they were doing so for AI training or benchmarking. I'm somewhat bullish on the possibility of AI puzzle solving (though I doubt models right now are up to the task), but that's why I have so many NO bets in this market.

Some of these options probably track with "Will there be a round of fish puzzles this year?" but I don't really have a good sense for the odds of that.

@JimHays I only bought No shares! I can't risk everything. The entire site already thinks I'm the anti-AI crusader.
