Will a large language model beat a super grandmaster playing chess by 2028?

1.9k

4.4kṀ1.4m

2029

60%

chance

ALL

If a large language models beats a super grandmaster (Classic elo of above 2,700) while playing blind chess by 2028, this market resolves to YES.

I will ignore fun games, at my discretion. (Say a game where Hiraku loses to ChatGPT because he played the Bongcloud)

Some clarification (28th Mar 2023): This market grew fast with a unclear description. My idea is to check whether a general intelligence can play chess, without being created specifically for doing so (like humans aren't chess playing machines). Some previous comments I did.

1- To decide whether a given program is a LLM, I'll rely in the media and the nomenclature the creators give to it. If they choose to call it a LLM or some term that is related, I'll consider. Alternatively, a model that markets itself as a chess engine (or is called as such by the mainstream media) is unlikely to be qualified as a large language model.

2- The model can write as much as it want to reason about the best move. But it can't have external help beyond what is already in the weights of the model. For example, it can't access a chess engine or a chess game database.

I won't bet on this market and I will refund anyone who feels betrayed by this new description and had open bets by 28th Mar 2023. This market will require judgement.

Update 2025-21-01 (PST) (AI summary of creator comment): - LLM identification: A program must be recognized by reputable media outlets (e.g., The Verge) as a Large Language Model (LLM) to qualify for this market.
- Self-designation insufficient: Simply labeling a program as an LLM without external media recognition does not qualify it as an LLM for resolution purposes.

Update 2025-06-14 (PST) (AI summary of creator comment): The creator has clarified their definition of "blind chess". The game must be played with the grandmaster and the LLM communicating their respective moves using standard notation.

AI Capabilities

Technology

Technical AI Timelines

Chess

LLM Chess

Get

1,000

to start trading!

People are also trading

When will a Large Language Model beat me at chess?

Will a Language Model under 10B parameters play chess at Grandmaster level by 2050?

88% chance

Will an LLM beat a Super GM Bot on chess.com by 2028?

50% chance

Which of these Language Models will beat me at chess?

Will an AI model achieve superhuman ELO on Codeforces by the 31 December 2025?

41% chance

Will an LLM (a GPT-like text AI) defeat the World Champion at Chess before 2035?

72% chance

Large language model at least diamond in StarCraft 2 by 2028?

15% chance

Will there be a new youngest Chess grandmaster before the end of 2025?

33% chance

Will a publicly available LLM/Agent beat a 2000 rated Elo chess player online rapid chess by March 2027?

85% chance

Which chess engine will be the strongest at the end of 2028?

462 Comments

1.7k Holders

7.7k Trades

Sort by:

filled a Ṁ20,000 YES at 70% order

Deepmind seems to have achieved this already: Grandmaster-Level Chess Without Search

"Unlike traditional chess engines that rely on complex heuristics, explicit search, or a combination of both, we train a 270M parameter transformer model with supervised learning on a dataset of 10 million chess games. We annotate each board in the dataset with action-values provided by the powerful Stockfish 16 engine, leading to roughly 15 billion data points. Our largest model reaches a Lichess blitz Elo of 2895 against humans, and successfully solves a series of challenging chess puzzles, without any domain-specific tweaks or explicit search algorithms."

@TimothyJohnson5c16 I don't think this counts, since it isn't a general purpose LLM, it's a chess-specific transformer model.

The author says: "My idea is to check whether a general intelligence can play chess, without being created specifically for doing so (like humans aren't chess playing machines)."

@NathanHelmBurger Ah, okay, good point.

I worry that a grandmaster would fraternize deez nutz

☝️🤓 there's no rule stating all the moves needs to be legal

@Quillist yes there is. The model needs to "beat" the human "at chess". A game where one or more players can make up rules during play is not "chess". And if you are playing chess, you've not "beaten" your opponent if you've broken the rules.

@Fion ☝️🤓 GothemChess is the most popular chess content producer - making him the ultimate authority on what the media thinks the rules of chess are, and in his tournaments AIs hallucinations are valid moves

it can't have external help beyond what is already in the weights of the model. For example, it can't access a chess engine or a chess game database.

can it write a chess engine and then run it?

@Wott I am not the market poster, but I think that would not be allowed. Writing and running a chess engine would imo be the same as accessing one.

Also, I personally doubt it could even do that successfully without heavily using external libraries, which I feel would go against the spirit of the market.

I think a better question to ask @MP is whether the model is allowed to run code at all.

it can't have external help beyond what is already in the weights of the model.

This seems to me as a no to code execution, quite clearly stated.

Yeah like I can write a chess engine that plays better than me, but that doesn't mean my chess elo is that high.

@Irigi Unless it executes the code "by hand" inside its chain-of-thought, I suppose.

@placebo_username Do you think it's possible to create a chess engine program whose logic can be "executed" completely in text? What might be the minimal length of such a chain of thought, for each move?

@AhronMaline @AhronMaline yes this is technically possible, you as a human could execute a chess algorithm on paper. My opinion is this would be terribly inefficient.

@KeithManning I mean sure, you could also simulate the Universe by arranging rocks in the desert... there's some order of magnitude where we can't call such things "possible".

@AhronMaline I agree, but you don’t have to write the whole move possibility tree for the game, that would be very impossible, traditional algorithmic chess engines don’t do this. You can instead use it to help you calculate some number of moves into the game, and then look at those board states and decide if they’re advantageous for you, even if you’re not calculating all the way to the end of the game.

This is how the core of chess engines work, and a human or LLM could theoretically do this with some paper (or chain of thought tokens) to write stuff down in, the problem is it’s orders of magnitude slower than chess engines that are designed just for this purpose and are tuned to modern computer hardware.

@KeithManning you'd need to take a chess engine, I guess a primitive one like Deep Blue, and manually carry out every logic step and FLOP. Every nested For loop. I doubt this could be done with less than... ten billion tokens per move? But I'm just spitballing.

We could make a market about this, and test by adding counters (or even Print statements) at each step in some codebase.

@AhronMaline nah it wouldn't be that complicated, the LLM would reason about it similarly to how a human does:

Use your intuition to think of a few good candidate moves
Apply each of those moves, visualise the resulting board state and see if they're advantageous to you
Do the same for your opponents turn on that visualised board
Repeat a couple times, then decide which of the original candidate moves is best

It wouldn't be running exact computer instructions like deep blue, but I agree it still won't be very efficient. I am betting very heavily NO on this market, so we are in agreement here, but I could see how LLMs could become semi-decent at chess if they were trained on it. The thing is I don't think they will be trained on it because it's not a good use of AI companies' times, because it will be orders of magnitude less efficient than actual chess engines.

@KeithManning That's a different story! Of course, that sort of "algorithm" is just how one reasons about chess. But it uses plenty of intuition that must be learned. That's not what "executing the code for a chess engine" should mean.

@AhronMaline I didn’t imagine we were talking about executing the low level logic of a chess engine, perhaps I misinterpreted the conversation.

Regardless, I would still call this process that I described an “algorithm”, it just uses natural language instead of machine code. The intuition would have to be learned yes, but less would be needed the more moves are calculated.

Until we can calculate the entire move tree (which we probably never will), intuition is needed for any chess engine, even traditional ones, modern chess engines implement “intuition” as a neural network that is trained to determine how good a board state is for the player, old school ones used a complicated heuristic function.

@KeithManning Okay then, I don't think we disagree. I understood @placebo_username to be suggesting that an LLM could win without the "intuition" that comes from specialized chess training, by writing down the code for an explicit, heuristic-based algorithm and then following it step by step. And I'm arguing that the sheer number of tokens involved in such an execution puts it completely out of reach.

In the unlikely case that an LLM does get really good at chess, it will be because training efficiency becomes good enough to learn excellent chess intuition just from non-specialized training on the data that's out there.

@AhronMaline I was indeed suggesting that even a low- or "medium"-level algorithm could be implemented, with overhead like a few tokens per bitwise operation or something. Obviously Deep Blue would be very expensive to implement in that fashion, but it's not optimized for that kind of efficiency anyways. Has anyone written a superhuman chess strategy that runs on a Raspberry Pi? How about just a board position evaluation heuristic, since doing the tree search part in naturalish language seems easier?

@placebo_username yeah, so I stand by what I said - that's much much too many tokens. But I'm just guessing. Would like to see a market about this.

I’ve played chess, and I’ve studied machine learning and traditional chess algorithms. An “LLM” in its current form is not suited for this task, super GMs are superhuman, you guys will come to see it.

@KeithManning Super GMs are human and therefore not superhuman

@Bayesian That’s correct, my message was not literal. I guess my point was that relative to an everyday person, these super GMs are nowhere near the same order of magnitude in terms of skill.

Traditional chess engines are superhuman, they are another order of magnitude better than super GMs.

LLMs don’t show anywhere near the same efficiency as chess engines, as they are generalised language completion functions, they are not specialised for chess. The amount of energy and processing power required to emulate the operation of a chess engine with an LLM would make it extremely inefficient.

With current technology, I could see an LLM using chain of thought to emulate a chess engine, however this would be so inefficient I can’t see it being fast enough.

If some new model architecture comes out that can play chess as well as do all of the current LLM functions, I believe it would be different enough to warrant not being labelled an LLM for the purposes of this question.

@KeithManning perhaps one might say grandmasters are supermodel

@KeithManning How would you feel about an identical market with a 2035 deadline?

@placebo_username 2035 llm will be obsolete

I agree with @JussiVilleHeiskanen the time is irrelevant here, as imo the LLM in its current form will never be suited for chess.

@KeithManning You should make the 2035 market, I think you could get a much better price. If you also dropped the weird blindness restriction, I think people would bet it up to like 80%.

sold Ṁ332 NO

@placebo_username https://manifold.markets/KeithManning/will-a-large-language-model-beat-a-dy8yzPLg9A?r=S2VpdGhNYW5uaW5n

@KeithManning I still think that GPT-5-thinking on high mode, and GPT-5-pro are general purpose LLMs for the sake of this question, based on my estimation of the author's intent. I don't think GPT-5-thinking is anywhere near GrandMaster level, but it certainly is demonstrating that general LLMs are progressing in Chess skill, since it's a lot better than GPT-4. I don't think we can rule out for sure that there will be continued progress on models that are still within the 'general purpose LLM' regime.