xAI Grok will beat OpenAI's flagship model on HumanEval benchmarks by the end of 2024.
13% chance
This is inclusive of any new models OpenAI unveils in 2024, but the question resolves to "yes" if Grok beats OpenAI's current state-of-the-art model at any time in 2024.
bought Ṁ10 YES from 69% to 71%
@DanMan314 67.0% is the HumanEval figure from the original GPT-4 report published more than a year ago. The current zero-shot GPT-4 performance, as reported by Papers With Code, is 76.5%, which is from Guo et al. (January 2024).
Note that the market creator is banned, so this will probably be resolved by moderators. Personally, I think the current version of GPT-4 is a more natural interpretation of "OpenAI's flagship model" than the original version of GPT-4.
Related questions
Will there be an AI language model that surpasses ChatGPT and other OpenAI models before the end of 2024?
31% chance
Gemini Ultra will achieve a higher rating than an OpenAI's GPT-4 model on Chatbot Arena Leaderboard before May 1st 2024
90% chance
Will OpenAI be in the lead in the AGI race end of 2026?
55% chance
Will an AI by OpenAI beat a super grandmaster playing chess by 2028?
47% chance
Will OpenAI announce a major breakthrough in AI alignment in 2024?
38% chance
Will general purpose AI models beat average score of human players in Diplomacy by 2028?
60% chance
Will AI beat top Magic the Gathering human player before the end of 2026?
40% chance
Will there be an AI language model that surpasses ChatGPT and other OpenAI models before the end of 2025?
64% chance
Will AIs beat human experts in question-answering on the GPQA benchmark before January 1st, 2027?
78% chance