Will a Large Language Model beat me at chess this year?

resolved Jan 1

I’m rated around 1900 FIDE. At the end of 2024, I’ll play one game at a rapid time control against an LLM selected from the top 3 of the Chatbot Arena leaderboard (https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard). Resolves YES if I lose, NO if I win, and 50% for a draw.

Thanks to all the traders who participated in this market. I played a game against o1, which I won quite easily. Here is the PGN:

1. e4 e5 2. Nf3 Nc6 3. Bb5 Nf6 4. O-O Nxe4 5. Re1 Nd6 6. Nxe5 Nxe5 7. Rxe5+ Be7 8. d4 Nxb5 9. c4 Nd6 10. c5 Nc4 11. Re2 O-O 12. b3 Na5 13. Nc3 d6 14. Bf4 Bg4 15. Nd5 Bxe2 16. Qxe2 Nc6 17. cxd6 Bxd6 18. Rd1 Re8 19. Bxd6 Rxe2 20. Nxc7 Qxd6 21. Nb5 Rae8 22. Nxd6 Re1+ 23. Rxe1 Rxe1#

If you're interested in markets like these, please check out my new market which includes GPT-5, Grok 3, Claude 3.5 Opus, and others:

Is this going to be resolved?

@Blocksterpen3 working on it today

we can only hope.

What prompt will you be using? I imagine that changes their performance quite a bit

Good point! On each move, I’ll provide it with the moves played so far in PGN notation, as well as the current position in FEN notation. This way, both standard representations of the position will be in context.
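To make the per-move prompt concrete, here is a minimal Python sketch of how the PGN move list and the FEN string could be combined into one message. The function names (`format_movetext`, `build_prompt`) and the exact prompt wording are assumptions for illustration only, not the prompt actually used in the game; the FEN is taken as given (for example, copied from lichess) rather than computed.

```python
def format_movetext(san_moves):
    """Join SAN moves into numbered PGN movetext, e.g. '1. e4 e5 2. Nf3'."""
    parts = []
    for i, move in enumerate(san_moves):
        if i % 2 == 0:
            # A White move starts a new move number.
            parts.append(f"{i // 2 + 1}.")
        parts.append(move)
    return " ".join(parts)


def build_prompt(san_moves, fen):
    """Build one prompt containing both the PGN movetext and the FEN.

    The wording below is hypothetical; only the idea of sending both
    representations on every move comes from the discussion above.
    """
    # The second FEN field is the side to move: 'w' or 'b'.
    side = "White" if fen.split()[1] == "w" else "Black"
    return (
        "We are playing a game of chess.\n"
        f"Moves so far (PGN): {format_movetext(san_moves)}\n"
        f"Current position (FEN): {fen}\n"
        f"It is {side} to move. Reply with one legal move in SAN."
    )


if __name__ == "__main__":
    moves = ["e4", "e5", "Nf3"]
    fen = "rnbqkbnr/pppp1ppp/8/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R b KQkq - 1 2"
    print(build_prompt(moves, fen))
```

Sending both representations is redundant by design: the PGN carries the game history, while the FEN states the current position directly, so the model does not have to replay the moves to know where the pieces are.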

I think that makes the model significantly worse than it could otherwise be. I'd recommend using whatever prompt someone that claims "SOTA LLM chess" or something came up with

I’m planning to use lichess to play the game, and those are the representations it provides. In a future market this might change.

bought Ṁ43 NO

When I tested this with ChatGPT 3.0 a while back, it couldn't even remember the board position and kept making illegal moves. How will you resolve if it does this?