Will LLMs be banned at the 2026 MIT Mystery Hunt?

1kṀ6403

Jan 17

10% chance


The MIT Mystery Hunt is an annual puzzle hunt that takes place in January. Historically, there have been no restrictions on the use of technology for solving puzzles. Will at least some LLMs be banned from use at the 2026 MIT Mystery Hunt? If LLMs are no longer the dominant AI paradigm, I will resolve this question according to whether at least some of the successors of LLMs are banned.

Clarification (made 8/28/2025): a partial ban of an LLM (e.g. if some uses of an LLM are banned while others are not) would resolve YES.

Update 2025-04-23 (PST) (AI summary of creator comment): Partial bans clarified
- Partial or model-specific bans: If only specific LLMs (for example, those provided via paid subscriptions) are banned while others remain available, the outcome resolves as YES.
- Access distinction: Since paid subscriptions typically offer access to LLMs that free subscriptions do not, a ban affecting these constitutes a ban on at least some LLMs.
- Edge case handling: Even in the scenario where paid subscriptions do not grant access to new models (e.g., only offering lower latency or more queries), the resolution will still be YES.

Update 2025-11-27 (PST) (AI summary of creator comment): Purpose-based restrictions are insufficient: Banning teams that use LLMs for the purpose of training/benchmarking AI tools would not resolve this market to YES. The market requires restrictions on particular types of use (how LLMs are used), not just restrictions based on purpose (why they are used). A YES resolution requires that some way of using an LLM to solve puzzles is actually prohibited.

The 2026 MIT Mystery Hunt includes the following as part of registering a team:

If your team or members of the team are participating in the hunt with the goal of training, benchmarking, or any aspects of improving an AI tool, please let us know the details of this here.

Note: We may handle teams classified as such differently, including marking or omitting them from wrapup/stats, deprioritizing hint requests or other inquiries, or other actions to make sure that the presence of these parties do not worsen the experience for other teams.

To be fair, this doesn't rule for or against using LLMs in a "normal" capacity.

@phenomist Yeah, my instinct is that even banning such teams would be insufficient for this question to resolve YES. That's because no particular type of use is restricted, just some kinds of purposes of use. It wouldn't preclude a team from fully using any LLM to solve puzzles.

bought Ṁ100 YES

Notable: the Galactic Puzzle Hunt forbids using LLMs to solve puzzles (though limited uses are allowed): "You may not ask an LLM or other generative AI to solve a puzzle or a significant part of one."

I'm pretty confused on how this question should resolve if the Mystery Hunt has a similar rule. I'm leaning toward ruling that forbidding LLMs in any capacity (like GPH did) counts as a YES resolution, but will accept arguments to the contrary for the next week. Consider this the final ruling if no one responds to this comment by a week from now.

@EricNeyman I’m also a bit torn about a rule like that. On the one hand this seems like an LLM-use Restriction rather than a Ban. On the other hand, you previously indicated that if some models are disallowed but not others (which also seems more like a restriction than a ban) that would also be YES.

My bets on NO have been along these lines: Mystery Hunt always encourages you to use the best tools available (aside from hacking the website, social engineering, etc.), so any kind of restriction in this regard would have surprised me (though GPH’s rule updates me toward YES). However, when picturing a ban, I was also imagining that some of the things our team did last year would be made illegal, such as asking an LLM for potential pun ideas to try to help solve a pun-based puzzle, which would probably still be allowed.

@JimHays Yeah, I agree that this is a restriction rather than a ban. I think the letter of the wording would suggest a NO resolution. On the other hand, the spirit of the question as I intended it is more of a benchmark: will the LLMs be so good that the Mystery Hunt will decide to make an additional rule that keeps competitors from crushing puzzles with LLMs? And according to that spirit, I think the resolution would be YES.

I'm curious if others have any thoughts. Tagging @Joshua and @Conflux since you guys have resolved a bunch of markets and might have thoughts.

@EricNeyman I wouldn’t be mad either way, and applaud your commitment to ironing this out now

@EricNeyman I agree with your thinking, and my gut is that the spirit of the market outweighs the hyperliteral written idea that it should be NO. “Historically, there have been no restrictions on the use of technology for solving puzzles” to me seems like a strong context clue that any restriction on the use of technology should be a YES. It’s the change in precedent this market was designed to consider! And it’s especially awesome you’re thinking about this in advance; if it was post facto I’d be a bit less comfy with YES, but here I am Team YES.

@EricNeyman Of interest for other people in this market:

/Eliza/how-many-manifold-users-will-report

opened a Ṁ250 NO at 25% order

@EricNeyman How would a partial ban resolve? E.g. only one specific model is disallowed, or paid subscriptions are disallowed but free options are ok?

@JimHays as per the description, this would resolve YES (it says "will at least some LLMs be banned").

Since paid subscriptions typically give access to LLMs that free subscriptions do not, that would also resolve YES. One edge case is if, at the time of the Mystery Hunt, no paid subscription actually gives access to new models (maybe just lower latency or more queries). I plan to still resolve YES in that case.