Will general-purpose AI models beat the average score of human players in Diplomacy by 2028?
60% chance
General-purpose language models (those not trained for a specific task) have demonstrated chess-playing ability. They are also capable of deception and lie detection. Will language models or visual-language models* beat the average score of human players over a series of 40 games on webDiplomacy.net by 2028? (The question is modeled after Meta's CICERO result.)
[EDIT: Please note that while "CICERO achieved more than 2x the average score of its opponents", this question requires only an above-average score.]
*Models or agents trained on different modalities (e.g. models capable of controlling a robotic arm, like PaLM-E) would also qualify, as long as they weren't trained specifically to play Diplomacy.
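To make the threshold concrete, here is a minimal sketch of the comparison described above. It assumes each game yields one numeric score per player and that the criterion compares the model's mean score over the series with the mean score of its human opponents; the function names and the `required_games` parameter are hypothetical, and the actual scoring used for resolution may differ.

```python
from statistics import mean


def score_ratio(model_scores, human_scores):
    """Ratio of the model's mean score to the human players' mean score.

    Both arguments are plain lists of per-game scores (an assumption;
    the real scoring system is not specified in the question).
    """
    return mean(model_scores) / mean(human_scores)


def beats_human_average(model_scores, human_scores, required_games=40):
    """True if the model's mean score is strictly above the human mean.

    required_games=40 mirrors the series length named in the question.
    """
    if len(model_scores) < required_games:
        raise ValueError("not enough games in the series")
    return score_ratio(model_scores, human_scores) > 1.0
```

Under this reading, a ratio just above 1.0 would be enough for the question to resolve YES, whereas the quoted CICERO result corresponds to a ratio above 2.0.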
This question is managed and resolved by Manifold.
Related questions
Will a large language model beat a super grandmaster playing chess by 2028?
68% chance
Will AI image generating models score >= 90% on Winoground by June 1, 2025?
77% chance
Will AI beat top human players at Civ6 (without cheating) by EOY 2026?
35% chance
Will an AI score 1st place on International Math Olympiad (IMO) 2025?
31% chance
Will an AI model achieve superhuman ELO on Codeforces by the 31 December 2025?
68% chance
Will an AI model outperform 95% of Manifold users on accuracy before 2026?
60% chance
Will an AI be capable of achieving a perfect score on the Putnam exam before 2026?
33% chance
Will any AI model score >80% on Epoch's Frontier Math Benchmark in 2025?
19% chance
Will an AI system beat humans in the GAIA benchmark before the end of 2025?
65% chance
What will be the best AI performance on Humanity's Last Exam by December 31st 2025?