Will a LLM beat human experts on GPQA by Jan 1, 2025?
Resolved YES on Dec 20

GPQA dataset here: https://arxiv.org/abs/2311.12022

"Human expert" means 74% accuracy.

Currently, GPT-4 gets 39%.

The LLM is allowed to use external tools (e.g. Google, Wolfram Alpha).

