Will Reflection Llama 3.1 70B be proven to beat Llama 3.1 405B Instruct in GPQA by the end of September 2024?
Resolved NO on Sep 30

Matt Shumer announced the "world’s top open-source model" on Twitter nearly three days ago, and AI Twitter has been going off ever since.

This question specifically uses GPQA as it is the only non-saturated eval in Shumer's original post. Acceptable evidence will be either Hugging Face's Open LLM Leaderboard v2 or other highly credible sources for evaluation data.


Reflection accused of being a fraud: https://unrollnow.com/status/1832933747529834747