Will kenshin9000 release a GPT-4 based chess engine by end of January?
resolved Jan 30
Resolved
NO

Resolves YES if kenshin9000 releases code which is strictly GPT-4 based, and anyone is able to use it to play a single game of chess against a chess engine. NO if it's end of January and he still hasn't released any code.

For this market, it has to run without calling chess engines (or equivalent); game-mechanics support libraries like python-chess are allowed, but only as long as neither move suggestions nor evaluations are taken from anywhere but GPT-4.
The code should make no illegal moves in 100 test games.
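
As a rough illustration of how the "no illegal moves in 100 test games" criterion could be checked, here is a minimal sketch. All names in it are hypothetical. The rules library (python-chess or similar) would be wrapped into the `legal_moves`, `apply_move` and `is_over` callables, so it only validates moves and never suggests or evaluates them; `propose_move` stands in for the GPT-4 side, and `opponent_move` is assumed to always return a legal move.

```python
from typing import Any, Callable, Iterable


def count_illegal_games(
    propose_move: Callable[[Any], Any],          # hypothetical GPT-4 side: state -> move
    opponent_move: Callable[[Any], Any],         # opponent, assumed to play legal moves only
    initial_state: Callable[[], Any],            # returns a fresh starting position
    legal_moves: Callable[[Any], Iterable[Any]], # rules library: state -> legal moves
    apply_move: Callable[[Any, Any], Any],       # rules library: (state, move) -> new state
    is_over: Callable[[Any], bool],              # rules library: has the game ended?
    games: int = 100,
    max_plies: int = 400,
) -> int:
    """Return the number of games in which the tested side proposed an illegal move."""
    illegal_games = 0
    for _ in range(games):
        state = initial_state()
        for ply in range(max_plies):
            if is_over(state):
                break
            if ply % 2 == 0:
                # Tested side moves on even plies; the library is used only to validate.
                move = propose_move(state)
                if move not in set(legal_moves(state)):
                    illegal_games += 1
                    break  # one illegal move is enough to fail this game
            else:
                move = opponent_move(state)
            state = apply_move(state, move)
    return illegal_games
```

Under this reading, the criterion is met only if the function returns 0 for `games=100`.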

See also Mira's question:
Will kenshin9000 release a functioning chess engine by end of January?


predicted NO

Note that kenshin9000_ is now claiming to have reached "just about 3800" Elo with Llama2-70B, and expects ~3900 vs SF16 with GPT4 (but he has also failed to deliver anything yet).

predicted NO

For some background info on how unaware GPT-3.5 is of the chess positions fed to it (even with the custom-crafted prompts kenshin9000_ is using), here is a hilarious example. This comes from his post on Jun 2, 2023.

I copied his lead-up prompt into a public OAI chat, and added a request for ChatGPT to draw an ASCII board representation. Its response showed the board all wrong, with a textual description that differed from its own drawing:

8 | . . . . . . . .
7 | . . . . . . . .
6 | . . . k . K . .
5 | . . . . . . P .
4 | P . P . B . . .
3 | . P b . . . . .
2 | . . B . . . . .
1 | . . . . . . . .
    ---------------
    a b c d e f g h

In this position:

Black king is on e6. (actually on d6)
White king is on g4. (actually on f6)
Black has a pawn on b3. (actually nowhere)
Black has a bishop on b6. (actually on c3)
White has a pawn on g7. (actually on g5)

(Moreover, White has two bishops, as well as 3 more pawns, all unlisted.)
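
The mismatches between the drawing and the description can be checked mechanically. The sketch below is only an illustration: it re-types the board from the drawing and assumes the usual FEN-style convention that uppercase letters are White pieces and lowercase letters are Black pieces. The helper name `pieces_on_board` is made up for this example.

```python
# Board copied from ChatGPT's drawing; uppercase = White, lowercase = Black (assumed).
BOARD = """\
8 | . . . . . . . .
7 | . . . . . . . .
6 | . . . k . K . .
5 | . . . . . . P .
4 | P . P . B . . .
3 | . P b . . . . .
2 | . . B . . . . .
1 | . . . . . . . .
"""

NAMES = {"k": "king", "q": "queen", "r": "rook",
         "b": "bishop", "n": "knight", "p": "pawn"}


def pieces_on_board(diagram: str) -> dict:
    """Map each occupied square (e.g. 'd6') to a label like 'Black king'."""
    found = {}
    for line in diagram.strip().splitlines():
        rank, _, cells = line.partition("|")
        # Cells are listed left to right, i.e. files a through h.
        for file, symbol in zip("abcdefgh", cells.split()):
            if symbol != ".":
                colour = "White" if symbol.isupper() else "Black"
                found[file + rank.strip()] = f"{colour} {NAMES[symbol.lower()]}"
    return found


pieces = pieces_on_board(BOARD)
for square in sorted(pieces):
    print(square, pieces[square])
```

Comparing the printed squares against the five statements in the description reproduces the corrections noted next to each one.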

This is the level of vaunted "chess knowledge" his upcoming chess engine is supposedly based on (besides whatever is channeled from training against Stockfish)!

predicted NO

See update from kenshin9000 on his progress, as it were.
He claims that with his monster code, running a single game will cost >$50, so proper testing as I envisioned would be prohibitively expensive. I will revise the resolution criteria to something reasonable. Unfortunately, given the vagueness of his proclamations, this is currently difficult to pin down.
