
Resolves YES if kenshin9000 releases code that is strictly GPT-4 based and anyone is able to use it to play a single game of chess against a chess engine. Resolves NO if the end of January arrives and he still hasn't released any code.
For this market, it has to run without calling chess engines (or equivalent); game-mechanics support libraries like python-chess are allowed, but only to the extent that neither move suggestions nor evaluations come from anywhere but GPT-4.
The code should make no illegal moves in 100 test games.
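A minimal sketch of how the "no illegal moves in 100 test games" check could be run, with python-chess used only for game mechanics (legality and game end). This is not an official test procedure for this market: `propose_move` is a hypothetical stand-in for whatever GPT-4-based code gets released, the random-moving opponent is a placeholder for a real chess engine, and stopping a game at its first illegal move is an assumed counting convention.

```python
import random
from typing import Callable

import chess  # python-chess: used for legality checks and game state only


def count_illegal_moves(
    propose_move: Callable[[chess.Board], str],  # hypothetical GPT-4 wrapper, returns SAN
    games: int = 100,
    max_plies: int = 200,
    seed: int = 0,
) -> int:
    """Play `games` games as White and count games ended by an illegal move."""
    rng = random.Random(seed)
    illegal = 0
    for _ in range(games):
        board = chess.Board()
        for _ in range(max_plies):
            if board.is_game_over():
                break
            if board.turn == chess.WHITE:
                san = propose_move(board)
                try:
                    # parse_san raises a ValueError subclass for invalid,
                    # illegal, or ambiguous moves in the current position.
                    move = board.parse_san(san)
                except ValueError:
                    illegal += 1
                    break  # assumed convention: the game ends on the first illegal move
            else:
                # Placeholder opponent: a uniformly random legal move,
                # standing in for a real chess engine.
                move = rng.choice(list(board.legal_moves))
            board.push(move)
    return illegal
```

Under this sketch the criterion would be `count_illegal_moves(propose_move) == 0`. Note that python-chess never suggests or evaluates a move for the tested side here; it only accepts or rejects what `propose_move` returns.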
See also Mira's question:
Will kenshin9000 release a functioning chess engine by end of January?
Note that kenshin9000_ is now claiming to have reached "just about 3800" Elo with Llama2-70B, and expects ~3900 vs SF16 with GPT-4 (but he has also failed to deliver anything so far).
For some background on how unaware GPT-3.5 is of the chess positions fed to it (even with the custom-crafted prompts kenshin9000_ is using), here is a hilarious example. It comes from his post of Jun 2, 2023.

I copied his lead-up prompt into a public OAI chat and added a request for ChatGPT to draw an ASCII board representation. Its response showed a board that is all wrong, plus a textual description that contradicts its own drawing:
8 | . . . . . . . .
7 | . . . . . . . .
6 | . . . k . K . .
5 | . . . . . . P .
4 | P . P . B . . .
3 | . P b . . . . .
2 | . . B . . . . .
1 | . . . . . . . .
    ---------------
    a b c d e f g h
In this position:
Black king is on e6. (Actually on d6.)
White king is on g4. (Actually on f6.)
Black has a pawn on b3. (Actually nowhere.)
Black has a bishop on b6. (Actually on c3.)
White has a pawn on g7. (Actually on g5.)
Moreover, White has two bishops and three more pawns, all unlisted.
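The mismatch between the drawing and the description can be checked mechanically. The sketch below parses the eight rank rows of the diagram and compares them with the five claims; it assumes the usual convention that uppercase letters are White pieces and lowercase are Black, and the names `parse_diagram` and `CLAIMS` are made up for this illustration.

```python
# The eight rank rows exactly as ChatGPT drew them.
DIAGRAM = """\
8 | . . . . . . . .
7 | . . . . . . . .
6 | . . . k . K . .
5 | . . . . . . P .
4 | P . P . B . . .
3 | . P b . . . . .
2 | . . B . . . . .
1 | . . . . . . . .
"""


def parse_diagram(text: str) -> dict[str, str]:
    """Map occupied squares (e.g. 'd6') to piece symbols (e.g. 'k')."""
    pieces = {}
    for line in text.strip().splitlines():
        rank, _, cells = line.partition("|")
        for file, symbol in zip("abcdefgh", cells.split()):
            if symbol != ".":
                pieces[file + rank.strip()] = symbol
    return pieces


# ChatGPT's textual description, as square -> piece
# (assumed convention: uppercase = White, lowercase = Black).
CLAIMS = {"e6": "k", "g4": "K", "b3": "p", "b6": "b", "g7": "P"}

pieces = parse_diagram(DIAGRAM)

# Every claim whose square does not hold the claimed piece in the drawing.
mismatches = {
    square: (claimed, pieces.get(square))
    for square, claimed in CLAIMS.items()
    if pieces.get(square) != claimed
}

for square, (claimed, drawn) in mismatches.items():
    print(f"{square}: described as {claimed!r}, drawn as {drawn!r}")
```

All five claims end up in `mismatches`: four of the named squares are empty in the drawing, and b3 holds a White pawn rather than a Black one.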
This is the level of vaunted "chess knowledge" his upcoming chess engine is supposedly based on (besides whatever it channels from training against Stockfish)!
See update from kenshin9000 on his progress, as it were.
He claims that, with his monster code, running a single game will cost >$50, so proper testing as I envisioned (100 games, i.e. more than $5,000) would be prohibitively expensive. I will revise the resolution criteria to something reasonable. Unfortunately, given the vagueness of his proclamations, this is difficult to pin down at the moment.