Will general-purpose AI models beat the average score of human players in Diplomacy by 2028?
2027
60% chance
General-purpose language models (not trained for a specific task) have demonstrated chess-playing ability. They are also capable of deception and lie detection. Will language models or vision-language models* beat the average score of human players over a series of 40 games on webDiplomacy.net by 2028? (This question is modeled after Meta's CICERO result.)
[EDIT: Please note that while "CICERO achieved more than 2x the average score of its opponents", this question requires only an above-average score.]
*Models or agents trained on different modalities (e.g. models capable of controlling a robotic arm, like PaLM-E) would also qualify as long as they weren't trained specifically to play Diplomacy.
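To make the bar concrete, here is a minimal sketch of the comparison the question describes: the model's mean score over its games against the mean score of the human players. The function name `beats_human_average`, the `required_ratio` parameter, and all score values are hypothetical and only for illustration; the question does not specify how webDiplomacy.net scores would be collected or aggregated, so treat this as one possible reading rather than the resolution procedure.

```python
from statistics import mean


def beats_human_average(model_scores, human_scores, required_ratio=1.0):
    """Return True if the model's mean score exceeds required_ratio times
    the human players' mean score.

    required_ratio=1.0 is the "above average" bar this question asks for;
    required_ratio=2.0 mimics the "more than 2x" figure quoted for CICERO.
    """
    if not model_scores or not human_scores:
        raise ValueError("both score lists must be non-empty")
    return mean(model_scores) > required_ratio * mean(human_scores)


# Placeholder numbers, NOT real results: one score per game for the model
# over a 40-game series, plus scores for the human players in those games.
model_scores = [12.0] * 40
human_scores = [10.0] * 240

print(beats_human_average(model_scores, human_scores))       # above-average bar
print(beats_human_average(model_scores, human_scores, 2.0))  # stricter 2x bar
```

With these made-up numbers the model clears the above-average bar but would not clear a 2x bar, which is the distinction the EDIT note is drawing.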
Related questions
Will a large language model beat a super grandmaster playing chess by 2028?
50% chance
In 2028, will an AI be able to play randomly selected computer games at human level without getting to practice?
35% chance
In 2028, will an AI be able to play randomly-selected computer games at human level, given the chance to train via self-play?
65% chance
Will AI be able to generate correct images of a chess game in 2024?
49% chance
Will an AI by OpenAI beat a super grandmaster playing chess by 2028?
47% chance
Will an AI model outperform 95% of Manifold users on accuracy before 2026?
67% chance
Will AI reach human-level performance in Magic: The Gathering before 2026?
50% chance
Will AI beat top Magic the Gathering human player before the end of 2026?
40% chance
Will AI beat top Magic the Gathering human player before the end of 2028?
72% chance
Will an AI be capable of achieving a perfect score on the Putnam exam before 2028?
19% chance