This market predicts whether ChatGPT using the most powerful available GPT-4 model, assisted by Mira, will beat a selected human opponent in a game of chess. The game will be played on Lichess as a correspondence game, with ChatGPT taking the white pieces to provide a slight advantage.
Resolves YES if:
- ChatGPT, with Mira's assistance, wins the chess game against the selected human opponent.
Resolves 50% if:
- The game ends in a draw.
Resolves NO if:
- The selected human opponent wins.
Resolves as NA if:
- The game cannot be completed within 1 week after market close.
- The selected human opponent is found to be using a chess engine or receives help from another person.
- ChatGPT is judged to have been given unfair prompting.
The market creator retains the right to mark this market as NA or to modify the rules within the first week for any reason or no reason.
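The resolution rules above can be summarized as a small decision function. This is only an illustrative sketch: the names `GameResult` and `resolve_market` and the boolean flags are hypothetical, the assumption that NA conditions take precedence over the game result is my reading of the rules, and the creator's discretionary right to resolve NA is not modeled.

```python
from enum import Enum
from typing import Optional


class GameResult(Enum):
    WHITE_WINS = "1-0"    # ChatGPT plays white
    BLACK_WINS = "0-1"    # the selected human opponent plays black
    DRAW = "1/2-1/2"


def resolve_market(
    result: Optional[GameResult],
    finished_within_week: bool = True,
    opponent_cheated: bool = False,
    unfair_prompting: bool = False,
) -> str:
    """Return 'YES', 'NO', '50%', or 'NA' according to the rules listed above."""
    # Assumption: any NA condition overrides the result of the game.
    if result is None or not finished_within_week:
        return "NA"
    if opponent_cheated or unfair_prompting:
        return "NA"
    if result is GameResult.WHITE_WINS:
        return "YES"
    if result is GameResult.DRAW:
        return "50%"
    return "NO"
```

For example, `resolve_market(GameResult.DRAW)` gives `"50%"`, while a win that is later judged to involve unfair prompting gives `"NA"`.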
The selected human opponent is someone who knows the rules of chess but doesn't play frequently and will not be allowed the use of a chess engine during the game. They have stated "I will not practice, but I will try my hardest to win".
Mira will act as a human assistant for ChatGPT, writing combinations of 6 types of prompts for ChatGPT:
1. Provide the current state of the board. (I plan to use PGN and FEN.)
2. Request a list of candidate moves along with explanations.
3. Request an analysis of a specific move and its likely continuation. Mira is not allowed to select a specific move for analysis; ChatGPT must select the list of moves to analyze.
4. Request a ranking of moves from a previously generated list.
5. Request a specific move be finally chosen given all of the above analysis.
6. Notification that a move is illegal, along with an explanation of why.
A test game has been completed against Bing using only turns of PGN-formatted moves, so these should be sufficient. If additional prompts are needed, Mira will exercise subjective judgment. Mira and ChatGPT will not be allowed to access any chess engine during the game. Mira will provide a transcript of prompts used.
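As a rough sketch of how the six prompt types and the "ChatGPT picks the moves to analyze" restriction could be tracked in a transcript, something like the following would work. All names here (`PromptType`, `board_state_prompt`, `analysis_request_allowed`) and the prompt wording are hypothetical illustrations, not the prompts Mira will actually use.

```python
from enum import Enum, auto
from typing import Sequence


class PromptType(Enum):
    BOARD_STATE = auto()      # current position, given as PGN and FEN
    CANDIDATE_MOVES = auto()  # ask for candidate moves with explanations
    ANALYZE_MOVE = auto()     # ask for analysis of a move and its continuation
    RANK_MOVES = auto()       # ask for a ranking of previously listed moves
    CHOOSE_MOVE = auto()      # ask for the final move choice
    ILLEGAL_MOVE = auto()     # report that a move was illegal, and why


# FEN of the standard chess starting position.
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"


def board_state_prompt(pgn: str, fen: str) -> str:
    """Format a BOARD_STATE prompt (the wording is a made-up example)."""
    return f"Moves so far (PGN): {pgn or '(none)'}\nCurrent position (FEN): {fen}"


def analysis_request_allowed(move: str, chatgpt_candidates: Sequence[str]) -> bool:
    """An ANALYZE_MOVE prompt is allowed only for a move ChatGPT itself proposed."""
    return move in chatgpt_candidates
```

Under this reading, `analysis_request_allowed("Nf3", ["e4", "d4"])` is `False`, because asking about a move ChatGPT never listed would leak Mira's own preference.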
If there is a dispute about whether a prompt was unfair (such as by leaking a preference for certain moves to ChatGPT), the human opponent will be allowed to review the transcript and the discussion, and judge whether ChatGPT was given an unfair advantage.
There is a 1-week time limit on completing the game after the market closes. Otherwise, there is no strict time limit for either side on individual moves. If the opponent intentionally delays the game to run out the 1-week limit, the market will resolve NA, but Mira would be disappointed in them.
To avoid a conflict of interest, Mira will not bet more than a token amount (M$10) in this market.
@Odoacre I played a game of chess against Bing and was 2 pawns down before it blundered its queen 14 moves in. The prompt engineering in this market is mainly to prevent blunders. I plan to test more prompts against myself closer to market close, before doing the market's game.
I rarely play chess, though I have implemented chess engines before. I'm probably rated below 1200 Elo.
https://dkb.blog/p/chatgpts-chess-elo-is-1400 - and this is without the sort of prompt engineering that Mira will be able to do, so presumably with Mira's assistance it can do even better.
Beginner Elo ratings are typically below 1000, so it seems like ChatGPT should be favored to win here.
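To put a rough number on "favored", the standard Elo expected-score formula can be applied to the two ratings mentioned above. This is a back-of-the-envelope sketch: it assumes the blog's 1400 estimate and a 1000-rated beginner are on the same rating scale, which may not hold, and the function name `expected_score` is just for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Elo expected score of player A against player B (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


chatgpt_rating = 1400   # estimate from the linked blog post
beginner_rating = 1000  # upper end of the "typically below 1000" range

print(round(expected_score(chatgpt_rating, beginner_rating), 3))  # 0.909
```

Because the expected score counts a draw as half a point, it lines up with this market's payout (YES = 1, draw = 50%, NO = 0), so under these assumptions the fair price would be around 91%, before accounting for NA outcomes or any mismatch between the two ratings.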
Can you give an idea of the selected human's skill level? Who do they play against? Do they know their Elo rating?
@jack They pretty much never play chess. They know the rules and have played it before, but I'd estimate it's a "once every couple years" activity. They don't have an account at any chess sites, and have not been assigned an Elo rating.
They've played other chess-like games before (Luzhanqi a lot, as a kid), and can compete at a reasonably high level in competitive multiplayer video games such as DotA 2 (top 1%) given some practice, so it's possible that being generally smart and having time to plan out moves will give them an edge over the average beginner.