
Currently, large language models trained to predict text are comparable in effectiveness to purpose-built translation systems, but they remain clearly inferior to other narrow AI systems at tasks like playing chess.
I will resolve this market YES if, at the end of the year, the best chess-playing engines still need to be specifically trained for the task (e.g. through adversarial self-play, or on large datasets consisting mainly of past games), and NO if LLMs trained on general text, or other similarly broad-purpose AIs, are able to win a majority of games against the best available narrow models.
I will not count multi-purpose AI systems that are explicitly designed with narrow game-playing subsystems, or that make API calls to chess engines. I will count broad AIs with access to calculators, program interpreters, and the internet, as long as it can be determined that they are not calling a chess engine.
The market resolves to my judgment of the state of affairs on 01/01/2024, and N/A if there is a gross lack of information (e.g. I cannot get access to state-of-the-art models for testing, and/or there is a strong lack of consensus among experts).
🏅 Top traders
| # | Name | Total profit |
|---|------|--------------|
| 1 | | Ṁ105 |
| 2 | | Ṁ66 |
| 3 | | Ṁ51 |
| 4 | | Ṁ36 |
| 5 | | Ṁ12 |
The best chess engines don't win a majority of games even against other chess engines: 73 of the 100 games in Leela vs. Stockfish were draws (https://en.m.wikipedia.org/wiki/TCEC_Season_19).
For any model to win a majority of games against the top engines would imply an enormous leap in chess strength in general, not just for generalist AIs.
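To make the draw issue concrete, here is a minimal illustrative sketch (the 14–73–13 match score is hypothetical, not from the source) showing how the two readings of "win the majority of games" can come apart when most games are drawn:

```python
# Hypothetical 100-game match with a TCEC-like draw rate.
wins, draws, losses = 14, 73, 13
total = wins + draws + losses

# Reading 1: majority of all games played (draws count against the winner).
majority_of_all_games = wins > total / 2

# Reading 2: majority of decisive games only (draws excluded).
majority_of_decisive_games = wins > (wins + losses) / 2

print(majority_of_all_games)       # False: only 14 wins out of 100 games
print(majority_of_decisive_games)  # True: 14 wins vs 13 losses
```

Under the first reading, a model would essentially never resolve this YES/NO question by "winning a majority" against a top engine; under the second, a narrow edge in decisive games would suffice.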
Would you interpret “win the majority of games” to exclude draws?
@DanMan314 Good point, I hadn't thought of that. Yes, I'll think about the exact formulation for a bit and then edit the description accordingly, thanks.