Will an LLM beat human experts on GPQA by Jan 1, 2025?
Basic
46
Ṁ16k
Jan 2
29% chance
GPQA dataset here: https://arxiv.org/abs/2311.12022
"Human expert" means 74% accuracy.
GPT-4 currently scores 39%.
The LLM is allowed to use external tools (e.g. Google, Wolfram Alpha).
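As a rough illustration of the resolution rule described above, here is a minimal sketch. The function names are hypothetical, and it assumes "beat" means strictly more than 74% accuracy; the description does not say how an exact tie would resolve.

```python
# Figures taken from the market description above.
HUMAN_EXPERT_ACCURACY = 0.74  # the stated "human expert" bar
GPT4_ACCURACY = 0.39          # the stated GPT-4 score


def beats_human_experts(llm_accuracy: float) -> bool:
    """Return True if the accuracy is strictly above the human-expert bar.

    Assumption: a tie at exactly 74% does not count as "beating".
    """
    return llm_accuracy > HUMAN_EXPERT_ACCURACY


def gap_to_threshold(llm_accuracy: float) -> float:
    """Percentage points still missing to reach the human-expert bar."""
    return round((HUMAN_EXPERT_ACCURACY - llm_accuracy) * 100, 1)


if __name__ == "__main__":
    print(beats_human_experts(GPT4_ACCURACY))  # GPT-4 is below the bar
    print(gap_to_threshold(GPT4_ACCURACY))     # points GPT-4 would need to gain
```

Under these assumptions, GPT-4 would need to gain 35 percentage points, with or without external tools, for the question to resolve YES.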
Related questions
Will Google have a better LLM than OpenAI by 2025?
35% chance
Will an LLM (a GPT-like text AI) defeat the World Champion at Chess before 2035?
54% chance
Will an open-source LLM beat or match GPT-4 by the end of 2024?
81% chance
Will the most interesting AI in 2027 be a LLM?
37% chance
Will any LLM outrank GPT-4 by 150 Elo in LMSYS chatbot arena before 2025?
16% chance
Will an opensource LLM on huggingface beat an average human at the most common LLM benchmarks by July 1, 2024?
74% chance
Will AIs beat human experts in question-answering on the GPQA benchmark before January 1st, 2027?
85% chance
Will the best public LLM at the end of 2025 solve more than 5 of the first 10 Project Euler problems published in 2026?
44% chance
Will OpenAI's next major LLM (after GPT-4) surpass 74% accuracy on the GPQA benchmark?
55% chance
Will OpenAI's next major LLM (after GPT-4) surpass 70% accuracy on the GPQA benchmark?
56% chance