Will Reflection Llama 3.1 70B be proven to beat Llama 3.1 405B Instruct in GPQA by the end of September 2024?
Sep 30 · 2% chance

Matt Shumer announced the "world’s top open-source model" on Twitter nearly three days ago, and AI Twitter has been going off ever since.

This question specifically uses GPQA because it is the only non-saturated eval in Shumer's original post. Acceptable evidence will be either results from Hugging Face's Open LLM Leaderboard v2 or evaluation data from other highly credible sources.


Reflection accused of being a fraud: https://unrollnow.com/status/1832933747529834747