When will AI be able to solve 50% of official Jane Street puzzles?
2024: 4%
2025: 17%
2026: 41%
2027: 48%
2028: 55%

Every month for around the last 10 years, Jane Street (a trading firm) has released a difficult puzzle on their website: https://www.janestreet.com/puzzles/archive/.

Right now, the best publicly accessible AI (GPT-4 or Gemini Ultra) is not very good at this. I tried running the February puzzle through both. GPT-4 gave a few definitions and then said the problem was complex; it did correctly simulate it afterward, but the problem asked for an exact answer. Gemini Ultra wasn't even close.

During which years will a publicly accessible AI be able to solve at least 6 of the 12 puzzles released during that year? (Resolves YES for each year in which this happens. Multiple years can resolve YES.)

Clarifications

  • Must be a general-purpose AI model, not AlphaGeometry or something

  • Publicly accessible = reasonably accessible by an average interested member of the public

  • Puzzles must be solved with minimal human input, aside from maybe "Let's think step by step" or something. I want to basically just copy-paste the puzzle and have it give a solution.

  • The model is not allowed to search for the solution or copy from a similar puzzle; it must clearly be solving the puzzle itself.

  • Different AIs can solve different puzzles, as long as each one is released before the end of the month of the puzzle it is solving and is still general-purpose. (If GPT-5 can solve all the puzzles and is released in October of this year, it can't retroactively count for the earlier puzzles.)

  • Resolves N/A if the puzzles stop being published.
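
For anyone who wants to sanity-check a given year against the rules above, here is a minimal sketch of the counting logic in Python. It is only an illustration, not an official resolver: the `Attempt` record and the function names are made up for this example, it assumes one puzzle per month, it reads "released before the end of the month" as "on or before the last day of that month", and whether a puzzle counts as solved is still a human judgment call. The N/A case (puzzles no longer being published) is not modeled.

```python
import calendar
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Attempt:
    """Hypothetical record of one model trying one monthly puzzle."""
    puzzle_month: date    # any day in the month the puzzle was released
    model_release: date   # date the model became publicly accessible
    solved: bool          # judged solved with minimal human input, no lookup


def end_of_month(d: date) -> date:
    """Last calendar day of the month containing d."""
    last_day = calendar.monthrange(d.year, d.month)[1]
    return date(d.year, d.month, last_day)


def attempt_counts(a: Attempt) -> bool:
    """A solve counts only if the model was out by the end of the puzzle's month.

    Assumption: "before the end of the month" includes the last day itself.
    """
    return a.solved and a.model_release <= end_of_month(a.puzzle_month)


def year_resolves_yes(attempts, year: int, threshold: int = 6) -> bool:
    """True if at least `threshold` distinct monthly puzzles of `year` count.

    Different models may contribute different months; each month counts once.
    """
    solved_months = {
        a.puzzle_month.month
        for a in attempts
        if a.puzzle_month.year == year and attempt_counts(a)
    }
    return len(solved_months) >= threshold
```

With this sketch, a model released on October 1 that solves all 12 puzzles of its release year only gets credit for October through December, so that year does not reach 6, which matches the GPT-5 example above.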

where's 2028+

@ZoravurSingh This is unlinked MC, so if you don’t think it will happen before 2028 you can bet NO on the pre-2028 options. I didn’t want to add too many years because I think it’s more difficult to have a good prediction the farther out you go

Can it run code like in Code Interpreter?

@ahalekelly Sure. It can't be a back-and-forth thing with the human, but it can use Code Interpreter. I wouldn't expect it to help a ton though; I think the puzzles are mostly not easily brute-forceable in that way.