Will Meta release any same-size LLaMa that performs better at MMLU before October 14th 2024?
Resolved NO (Oct 15)

Will any model that performs better than the equivalent size (± 10% in parameter count) LLaMa 3.1 model be officially released by Meta, where "performs better" means "at least 0.5% more accurate at MMLU"? Base model only.

For example, LLaMa 3.1 70B's MMLU score is 83.6% (an improvement over LLaMa 3.0 70B's 79.5% MMLU). LLaMa 3.2 70B would need to perform at 84.1% MMLU to resolve this market YES. Note that any model in the family (8B, 70B, 405B) performing at least 0.5% better is enough to resolve this market.
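The resolution rule above can be sketched as a small check. This is an illustrative script, not an official resolution tool; the `resolves_yes` helper and its structure are assumptions, and the 8B baseline is omitted because its MMLU score is not listed in this description.

```python
# MMLU scores (%) for LLaMa 3.1 base models, taken from this market's description.
# Maps parameter count -> baseline MMLU. The 8B score is not given here, so it is omitted.
LLAMA_31_BASE = {70e9: 83.6, 405e9: 85.2}

def resolves_yes(params: float, mmlu: float) -> bool:
    """True if a new Meta base model matches a LLaMa 3.1 model's size
    (within ±10% in parameter count) and scores at least 0.5 points
    higher on MMLU, per the market's criteria."""
    for base_params, base_mmlu in LLAMA_31_BASE.items():
        same_size = abs(params - base_params) <= 0.10 * base_params
        if same_size and mmlu >= base_mmlu + 0.5:
            return True
    return False

# The worked example from the description: a 70B model at 84.1% MMLU qualifies.
print(resolves_yes(70e9, 84.1))  # True
print(resolves_yes(70e9, 83.8))  # False: less than 0.5 points above 83.6
```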

Multimodal models are eligible, but only text MMLU performance will be evaluated. Models that were fine-tuned, DPO'd, RLHF'd, or CPT'd on synthetic data will not resolve this market.

For reference, LLaMa 3.0 70B's MMLU score was 79.5, GPT-4o's score is 88.7, and LLaMa 3.1 405B base's score is 85.2. (LLaMa 3.1 405B Instruct's score is 88.7.)

