When will AI be able to solve 50% of official Jane Street puzzles?

1kṀ2565

2029

2025: 11%
2026: 32%
2027: 56%
2028: 72%
2024: Resolved

Every month for around the last 10 years, Jane Street (a trading firm) has released a difficult puzzle on their website: https://www.janestreet.com/puzzles/archive/.

Right now, the best publicly accessible AIs (GPT-4 and Gemini Ultra) are not very good at this. I ran the February puzzle through both. GPT-4 gave a few definitions and then said the problem was complex (it did correctly simulate it afterward, although the problem asked for an exact answer), and Gemini Ultra wasn't even close.

During which year will a publicly accessible AI be able to solve at least 6 of the 12 puzzles released during that year? (Each year in which this happens resolves YES. Multiple years can resolve YES.)

Clarifications

Must be a general-purpose AI model, not AlphaGeometry or something
Publicly accessible = reasonably accessible by an average interested member of the public
Puzzles must be solved with minimal human input, aside from maybe "Let's think step by step" or something. I want to basically just copy-paste the puzzle and have it give a solution.
The model is not allowed to search for the solution or copy from a similar puzzle; it must clearly be solving the puzzle itself.
Different AIs can solve different puzzles, as long as each one is general-purpose and was released before the end of the month of the puzzle it solves. (If GPT-5 is released in October of this year and can solve all the puzzles, it can't retroactively count for the earlier ones.)
Resolves N/A if the puzzles stop being published.
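
As a rough sketch of how these rules combine, the check for a single year could look like the following. The `Solve` record, its field names, and the helper functions are illustrative assumptions, not part of the official resolution criteria. In particular, "released before the end of the month" is read here as "on or before the last day of that month", and whether a model is general-purpose or used search is treated as a human judgment passed in as a flag.

```python
import calendar
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Solve:
    """One claimed solve of a monthly puzzle (hypothetical record)."""
    puzzle_month: int      # 1-12, month the puzzle was released
    model_release: date    # public release date of the model that solved it
    general_purpose: bool  # False for narrow systems such as AlphaGeometry
    used_search: bool      # True if the model looked up or copied a solution


def counts(solve: Solve, year: int) -> bool:
    """Apply the clarifications to a single claimed solve."""
    # Assumption: the cutoff is the last calendar day of the puzzle's month.
    last_day = calendar.monthrange(year, solve.puzzle_month)[1]
    deadline = date(year, solve.puzzle_month, last_day)
    return (
        solve.general_purpose
        and not solve.used_search
        and solve.model_release <= deadline
    )


def year_resolves_yes(solves: list[Solve], year: int, threshold: int = 6) -> bool:
    """YES if at least `threshold` distinct monthly puzzles have a valid solve."""
    solved_months = {s.puzzle_month for s in solves if counts(s, year)}
    return len(solved_months) >= threshold
```

Under this reading, a model released on October 15 that solves all 12 puzzles only gets credit for October through December (3 puzzles), so that year would not resolve YES on its own. The same model released on March 1 would get credit for March through December (10 puzzles).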

Technical AI Timelines

Wouldn't JS just tweak the puzzles if AI could easily solve them? What would be the point of publishing them and keeping a list of solvers, etc.?

@Lorenzo What tweak do you think would work for this? The humans still need to be able to solve them. You probably get less interesting puzzles if you’re optimizing for AI not being able to solve them, also

@dominic Check if a puzzle is solvable by AI, if so, tweak it a bit until it isn't.

> You probably get less interesting puzzles if you’re optimizing for AI not being able to solve them

Eh, idk, maybe people will find them less interesting if AIs can solve them

IMO this would require some really high quality planning & CoT architecture that doesn’t seem achievable for a general public AI model in the next 1-2 years. E.g. the Feb 2025 puzzle requires (1) a hypothesis about how the features of the puzzle are related, (2) a lot of exploration to find “the trick” of connecting the clues to the answers, and (3) an intuition for how to stitch the answers together to derive the final puzzle answer. Right now LLM/transformer-based models just don’t seem to have the creative knack to solve more than half of these kinds of problems. Could be wrong.

@pricemaker It's tough, but I do think the advent of reasoning models in 2024 helped the models go from "completely hopeless" to "making some genuine attempts". So who knows what the next generation will be able to do.

o1-mini and o1-preview both fail the most recent puzzle. But they do reasonable enough things, and are good enough at math, that I am not confident full o1 won’t be able to solve it - I think most of these are now underpriced, though I won’t trade because it’s my question

where's 2028+

@ZoravurSingh This is unlinked MC, so if you don’t think it will happen before 2028 you can bet NO on the pre-2028 options. I didn’t want to add too many years because I think it’s more difficult to have a good prediction the farther out you go

Can it run code like in Code Interpreter?

@ahalekelly Sure. It can't be a back-and-forth thing with the human, but it can use Code Interpreter. I wouldn't expect it to help a ton, though; I think the puzzles are mostly not easily brute-forceable in that way.