Which of these Language Models will beat me at chess?
88% Any model announced before 2032
86% Any model announced before 2031
83% Any model announced before 2030
82% Any model announced before 2029
81% Any model announced by Google before 2030
80% Any open-weights model announced before 2030
80% Any model announced by Anthropic before 2030
72% Any model announced by OpenAI before 2030
69% Any model announced before 2028
67% Any model announced by xAI before 2030
64% Any model announced by a Chinese lab before 2030
57% Any model announced by Meta before 2030
55% Any model announced before 2027
55% Any model announced by SSI before 2030
30% Gemini 4
30% GPT-6
16% Claude 5 Opus
16% Any Claude 5 model
15% Grok 5
14% Gemini 3.5

Which of these models will beat me at chess once released? Resolves YES if they win, NO if I win, and 50% for a draw.

I'm rated about 1900 FIDE. When each of these models is released, I'll play a game of chess with it at a rapid time control. On each move, I'll provide it with the game state in PGN and FEN notation. If a model makes three illegal moves, it loses. Notation discrepancies such as Nbd2 vs. Nd2 will not count as illegal moves. I will play white.
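
The three-strike rule and the notation leniency could be sketched roughly as follows. This is only an illustration, not the procedure actually used in the games: `loosen`, `is_accepted`, and `count_strikes` are hypothetical names, and the list of legal moves for each position is assumed to come from a separate move generator.

```python
import re

# Piece letter, optional disambiguating file/rank, optional capture mark, target square.
_PIECE_MOVE = re.compile(r"^([KQRBN])([a-h]?[1-8]?)(x?)([a-h][1-8])$")


def loosen(san: str) -> str:
    """Strip check/mate/annotation marks and any disambiguator from a piece move."""
    san = san.strip().rstrip("+#!?")
    match = _PIECE_MOVE.match(san)
    if match is None:
        return san  # pawn moves and castling are compared as written
    piece, _disambiguator, capture, target = match.groups()
    return f"{piece}{capture}{target}"


def is_accepted(response: str, legal_sans: list[str]) -> bool:
    """True if the response names a legal move, ignoring disambiguation differences."""
    return loosen(response) in {loosen(move) for move in legal_sans}


def count_strikes(attempts: list[tuple[str, list[str]]]) -> int:
    """Count responses that match no legal move in their position."""
    return sum(1 for response, legal in attempts if not is_accepted(response, legal))


# Hypothetical attempts: (model response, subset of legal moves in that position).
attempts = [
    ("Nd2", ["Nbd2", "Nfd2", "e5"]),  # under-disambiguated, not a strike
    ("Nbd2", ["Nd2", "e5"]),          # over-disambiguated, not a strike
    ("Qxh7#", ["Nbd2", "e5"]),        # no such legal move, one strike
]
strikes = count_strikes(attempts)
forfeit = strikes >= 3  # assumption: the third strike ends the game
```

In this sketch an under-disambiguated response is simply accepted; in a real game the ambiguity would presumably still need to be settled by asking the model which piece it meant.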

Each option will stay open until the model is released, or it will resolve N/A if it's clear that the model will never be released. I'll periodically add models to this market which I find interesting. Once I play a game, I'll post the PGN in the comments before resolving. Multiple answers can resolve YES.

If I judge that my opponent’s position is hopelessly lost, at the level of being down a rook without compensation, I will submit the current position to a friend. If they agree that the position is lost, the game will be adjudicated as a win for me.
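
The material half of this threshold (being down a rook) can be checked roughly from the FEN alone; whether there is compensation remains a human judgment. A minimal sketch, assuming the conventional piece values (pawn 1, knight 3, bishop 3, rook 5, queen 9), which the rules do not specify; `material_balance` and `model_down_a_rook` are hypothetical names.

```python
# Conventional piece values in pawn units (an assumption, not part of the rules).
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}


def material_balance(fen: str) -> int:
    """White material minus Black material, read from the FEN piece placement."""
    placement = fen.split()[0]
    balance = 0
    for ch in placement:
        value = PIECE_VALUES.get(ch.lower())
        if value is None:
            continue  # digits, slashes, and kings carry no material value here
        balance += value if ch.isupper() else -value
    return balance


def model_down_a_rook(fen: str, threshold: int = 5) -> bool:
    """True if Black (the model) trails by at least a rook's worth of material."""
    return material_balance(fen) >= threshold


start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
# Illustrative position: the starting position with Black's a8 rook removed.
no_rook = "1nbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQk - 0 1"
```

A check like this only flags candidate positions; the rule still depends on the friend's agreement that the position is lost.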

The current system prompt is below. This may change over time.

“Let’s play a game of chess! I will be white, you will be black. On each turn, I will give you the pgn and the fen of the current position. Think as long as you like, and respond with the best move, ‘resign’ if you wish to resign, or ‘draw?’ if you wish to make a draw offer. Please do not respond with the updated pgn, etc. Also, do not use any external tools or search queries when making your decision.

If you attempt to make three illegal moves throughout the game, or if you use any external tools, the game will be adjudicated as a win for me.”

  • Update 2025-01-14 (PST) (AI summary of creator comment):

    • Model Type: Only general language models are being considered; chess-specific models are excluded.

    • Capabilities: The model must be able to output human languages and code.

  • Update 2025-05-11 (PST) (AI summary of creator comment): Regarding "Any model before X year" options:

    • These options will not resolve to 50% based on a draw in an individual game.

    • Such an option resolves to YES if any model released before the specified year wins its game against the creator.

    • It resolves to NO if no model released before the specified year wins its game against the creator (i.e., all relevant games are losses for the models or draws).

  • Update 2025-06-02 (PST) (AI summary of creator comment): For model series options (e.g., "Any Claude 4 model"):

    • The creator may resolve the option for the entire series after playing against one or more models from that series.

    • If the creator decides not to play additional models from that specific series, the option for the entire series will be resolved based on the outcome(s) of the game(s) played against models from that series up to that point (e.g., to NO if the tested model(s) lost and no further models from that series will be played).

  • Update 2025-10-19 (PST) (AI summary of creator comment): GPT-3 will not be tested as the creator does not have access to it (the model has been deprecated).

  • Update 2025-12-24 (PST) (AI summary of creator comment): o4 will resolve N/A as the full model will not be released. According to OpenAI, o4-mini is the latest small o-series model and has been succeeded by GPT-5 mini, indicating o4 will not be released as a standalone full model.
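
Taken together, the resolution rules in the description and the updates above could be sketched as follows. This is an illustrative reading, not the creator's tooling: all names are hypothetical, N/A resolutions are not modeled, and `resolve_any_before` assumes it is given the complete list of results for models released before the cutoff year.

```python
from enum import Enum


class Result(Enum):
    MODEL_WIN = "model_win"
    CREATOR_WIN = "creator_win"
    DRAW = "draw"


def resolve_single(result: Result) -> float:
    """Per-model option: 1.0 for YES, 0.0 for NO, 0.5 for a draw."""
    return {
        Result.MODEL_WIN: 1.0,
        Result.CREATOR_WIN: 0.0,
        Result.DRAW: 0.5,
    }[result]


def resolve_any_before(results: list[Result]) -> float:
    """'Any model before X' option: YES only if some model won; draws count as NO."""
    return 1.0 if any(r is Result.MODEL_WIN for r in results) else 0.0
```

For example, `resolve_any_before([Result.DRAW, Result.CREATOR_WIN])` gives 0.0, while a single `Result.DRAW` passed to `resolve_single` gives 0.5.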


I'm resolving the o4 option to N/A, since it's been 8 months and it seems like the full model won't be released. According to OpenAI:

"o4-mini is our latest small o-series model. It's optimized for fast, effective reasoning with exceptionally efficient performance in coding and visual tasks. It's succeeded by GPT-5 mini."

I’m going to be so real, I think by 2032 we get to the point where it plays a solid opening, then falls apart once you go off a strict line. I think this will be the case unless AI undergoes a full overhaul.

As is, it’s just predicting the next word, which means eventually it can copy a favourable line.

Pretty quickly that line doesn’t exist for them, given the number of possible moves in chess.

@Magnify it is not just predicting the next word

@Bayesian it’s a big simplification, but at its core that’s what the technology is

yeah, and that oversimplification is misleading

@Bayesian I don’t know if it is for the purposes of this market. There are a limited number of chess lines and instruction it can pull data from; I think once it gets the opening down, all OP has to do is play bizarrely in the middlegame and it’ll never figure it out.

@Magnify if you believe this, you should bet NO on the “Any model announced before 2032” and related markets!

@mr_mino I will but I have a balance of literally 0 haha

bought Ṁ6,949 NO

GPT 5.2 played poorly in the opening and blundered a piece. Its chess skill hasn't improved from its predecessor; the strongest model from OpenAI continues to be GPT 4.5.

1. d4 Nf6 2. c4 e6 3. Nf3 d5 4. g3 Be7 5. Bg2 O-O 6. O-O dxc4 7. Qc2 a6 8. a4 Bd7 9. Qxc4 c5 10. dxc5 Bxc5 11. Qxc5 Nc6 12. Nc3 b6 13. Qd6 Ne8 14. Qf4 e5 15. Nxe5 Nxe5 16. Qxe5 Qc7

bought Ṁ7,949 NO

Claude Opus 4.5 played poorly throughout the game and lost.

1. d4 Nf6 2. c4 e6 3. Nf3 b6 4. g3 Bb7 5. Bg2 Be7 6. b3 O-O 7. O-O d5 8. cxd5 exd5 9. Bb2 Nbd7 10. Nbd2 c5 11. Rc1 Rc8 12. dxc5 bxc5 13. Ne5 Nxe5 14. Bxe5 Qb6 15. Qc2 Rfd8 16. Rfd1 Bd6 17. Bxf6 gxf6 18. Nf3 Be5 19. Nh4 d4 20. Be4 Bxe4 21. Qxe4 f5 22. Qxe5 Rd5 23. Qxd5 Qc6 24. Qxf5 Qf3 25. Qxc8+ Kg7 26. exf3 d3 27. Rxc5 d2 28. Rc6 h6 29. Nf5+ Kh7 30. Rxh6#

bought Ṁ0 NO

Gemini 3 played well in the opening and had a large advantage, but played poorly in the endgame and lost. It seems to me to be one of the strongest models so far, around the same level as GPT-4.5.

1. d4 Nf6 2. c4 e6 3. Nf3 b6 4. g3 Ba6 5. Nbd2 Bb7 6. Bg2 c5 7. e4 cxd4 8. e5 Ng4 9. h3 Nxe5 10. O-O Nxf3+ 11. Nxf3 Be7 12. Nxd4 Bxg2 13. Kxg2 O-O 14. Qf3 d5 15. Rd1 Bf6 16. cxd5 Qxd5 17. Qxd5 exd5 18. Nb5 Nc6 19. Nc7 Rad8 20. Nxd5 Ne7 21. Nxf6+ gxf6 22. Bh6 Rfe8 23. a4 Nf5 24. Bf4 Rxd1 25. Rxd1 Re4 26. b3 h5 27. Rd5 Nd4 28. f3 Re2+ 29. Kf1 Nxb3 30. Kxe2

bought Ṁ5,942 NO

GPT 5.1 played well in the opening but lost.

1. d4 d5 2. c4 e6 3. Nc3 Nf6 4. cxd5 exd5 5. Bg5 Be7 6. e3 O-O 7. Bd3 Nbd7 8. Qc2 c5 9. dxc5 Nxc5 10. Nf3 Nxd3+ 11. Qxd3 Qb6 12. O-O Be6 13. Rfd1 Rfd8 14. Nd4 Rac8 15. Qb5 Bd7 16. Qxb6 axb6 17. Bxf6 Bxf6 18. Nxd5 Bxd4 19. Rxd4 Kf8 20. Nxb6 Rc6 21. Nxd7+ Rxd7 22. Rxd7 Ke8 23. Rd2 Ke7 24. g3 Rc1+ 25. Rxc1

bought Ṁ6,449 NO

Kimi K2 lost under the three-illegal-move rule. This was the game until then:

1. d4 d5 2. c4 dxc4 3. e4 Nf6 4. Nc3 e5 5. Nf3 Nc6 6. d5 Ne7 7. Nxe5

I played this game in a text file in Cursor. Surprisingly, it often tried to edit the move which I had already played!

Pretty sure you can't beat GPT-3. It distilled Stockfish

@clementdupOz I would bet that this is false, but in any case I don’t have access to GPT-3 as it is deprecated.


cool market!

bought Ṁ8,379 NO

Sonnet 4.5 played well in the opening but shortly after blundered a piece and an exchange.

1. d4 d5 2. c4 dxc4 3. e4 Nf6 4. e5 Nd5 5. Bxc4 Nb6 6. Bd3 Nc6 7. Ne2 Bg4 8. f3 Be6 9. Be3 Qd7 10. Nbc3 O-O-O 11. Be4 f5 12. exf6 exf6 13. O-O Bd6 14. Kh1 Rhe8 15. d5 Nxd5 16. Nxd5 Bxd5 17. Qxd5 Rxe4 1-0

bought Ṁ20,302 NO

GPT-5 played well in the opening but lost.

1. d4 Nf6 2. c4 e6 3. g3 d5 4. Nf3 Be7 5. Bg2 O-O 6. O-O dxc4 7. Qc2 a6 8. a4 Bd7 9. Ne5 Nc6 10. Nxc6 Bxc6 11. Bxc6 bxc6 12. Rd1 Qd5 13. Na3 Bxa3 14. Rxa3 Rfd8 15. Rc3 Qe4 16. Rxc4 c5 17. Qxe4 Nxe4 18. f3 Nf6 19. Rxc5 Nd5 20. e4 Nb4 21. Rxc7 Rac8 22. Bf4 Nc2 23. Rc1 Nb4 24. Rxc8 Rxc8 25. Rxc8#

bought Ṁ8,125 NO

Grok 4 played passively in the opening and lost. However, I was impressed by its defense and in particular by 16...f5! I think it’s weaker than GPT 4.5 but stronger than some of the other models. The game is below:

1. b3 e5 2. Bb2 Nc6 3. g3 Nf6 4. Bg2 Bc5 5. d3 O-O 6. Nd2 d6 7. Ngf3 Bg4 8. h3 Bh5 9. O-O Re8 10. g4 Bg6 11. e4 Qd7 12. Nc4 Nd4 13. Nxd4 exd4 14. Nd2 h5 15. g5 Nh7 16. f4 f5 17. gxf6 Nxf6 18. Nf3 Bxe4 19. dxe4 Nxe4 20. Bxd4 Bxd4+ 21. Qxd4 c5 22. Qd5+ Kh8 23. Qxh5+ Kg8 24. Ng5 Nxg5 25. fxg5 Re2 26. Bd5+ Qf7 27. Qxf7+ Kh8 28. Qh5#

bought Ṁ7,991 NO

Here is my game against Llama 4 Maverick:

1. d4 d5 2. Nc3 Nf6 3. Bf4 c6 4. Qd2 g6 5. O-O-O Bg7 6. Kb1 b5 7. f3 a5 8. h4 h5 9. e4 dxe4 10. Nxe4 Nxe4 11. fxe4 Qd5 12. exd5 1-0

I wasn’t able to find a model provider for Llama 4 Behemoth, but given this performance, I am not planning to play any other Llama 4 models. Therefore I am resolving “Llama 4” to NO.

Claude 4 Opus played decently in the opening but very quickly lost the plot. Since I don’t plan to play any other Claude 4 models, I am resolving “Any Claude 4 model” to NO.

1. d4 d5 2. c4 e6 3. Nc3 Nf6 4. cxd5 exd5 5. Bg5 Be7 6. e3 O-O 7. Bd3 Nbd7 8. Nf3 c6 9. Qc2 Re8 10. O-O h6 11. Bh4 Nf8 12. Ne5 Be6 13. f4 N6d7 14. Bxe7 Qxe7 15. Rae1 Nxe5 16. fxe5 Ng6 17. Bxg6 fxg6 18. Qxg6 Bf7 19. Qd3 Rad8 20. a3 Bg6 21. Qxg6 Rf8 22. Rf4 Qe6 23. Qxe6+ 1-0

To avoid wasting time in won positions, especially with models that use a lot of inference-time compute, I am implementing a new rule. If I judge that my opponent’s position is hopelessly lost, at the level of being down a rook without compensation, I will submit the current position to a friend. If they agree that the position is lost, the game will be adjudicated as a win for me.

bought Ṁ21,572 NO

OpenAI o3 played poorly throughout the game and made some strange sacrifices.

1. d4 d5 2. c4 e6 3. Nc3 Nf6 4. cxd5 exd5 5. Bg5 Be7 6. e3 O-O 7. Bd3 c5 8. dxc5 Nbd7 9. Nf3 Nxc5 10. Be2 Be6 11. O-O Nce4 12. Nxe4 dxe4 13. Nd4 Qb6 14. Nxe6 Qxe6 15. Qa4 h6 16. Bh4 Nd5 17. Bg3 Nxe3 18. fxe3 Bf6 19. Bc4 Qe7 20. Rad1 Qc5 21. Qb3 Qxe3+ 22. Qxe3 Rfe8 23. Rd7 Re7 24. Rxe7 Bxe7 25. Rxf7 Re8 26. Qc3 Bf6 27. Rxf6+ Kh8 28. Rxh6# 1-0

To clarify: if a game ends in a draw, the “Any model before X year” options will not resolve 50%. These options resolve either YES or NO depending on whether any model released before year X wins its game.

bought Ṁ8,949 NO

Here is the Gemini 2.5 game. In the final position Gemini resigned.

1. e4 c6 2. d4 d5 3. e5 Bf5 4. c3 e6 5. h4 h6 6. Nd2 Nd7 7. Ngf3 Ne7 8. Be2 c5 9. Nf1 Nc6 10. Ne3 Bh7 11. g4 Qb6 12. g5 cxd4 13. cxd4 Bb4+ 14. Kf1 O-O 15. gxh6 gxh6 16. Ng4 f5 17. Nxh6+ Kh8 18. Bf4 Be7 19. Kg2 Rg8+ 20. Kh3 Raf8 21. Rg1 Rxg1 22. Qxg1 Ndxe5 23. Nxe5 Nxe5 24. Bxe5+ Bf6 25. Rc1 Qd8 26. Qg5 1-0

© Manifold Markets, Inc.