Will an LLM beat human experts on GPQA by Jan 1, 2025?
49% chance
GPQA dataset: https://arxiv.org/abs/2311.12022
"Human expert" means 74% accuracy.
Currently, GPT-4 gets 39%.
The LLM is allowed to use external tools (e.g. Google, Wolfram Alpha).
Related questions
Will an opensource LLM on huggingface beat an average human at the most common LLM benchmarks by July 1, 2024?
79% chance
Will LLMs mostly overcome the Reversal Curse by the end of 2025?
63% chance
China will make a LLM approximately as good or better than GPT4 before 2025
64% chance
Will an open-source LLM beat or match GPT-4 by the end of 2024?
64% chance
Will an LLM (a GPT-like text AI) defeat the World Champion at Chess before 2035?
47% chance
Will Google have the best LLM by EOY 2024?
35% chance
Will a LLM-based AI be used for a law enforcement decision before 2025?
63% chance
Will LLMs be better than typical white-collar workers on all computer tasks before 2026?
27% chance
Will Google have a better LLM than OpenAI by 2025?
31% chance
Will any LLM outrank GPT-4 by 150 Elo in LMSYS chatbot arena before 2025?
45% chance