Will a Mamba 7B model trained on 2 trillion tokens outperform Llama2-13B?
69% chance

This question resolves YES if someone trains a Mamba (https://twitter.com/tri_dao/status/1731728602230890895) language model with <=7.5 billion parameters on <=2 trillion tokens that outperforms Llama2-13B on the Hugging Face Open LLM Leaderboard (https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
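
For reference, here is a minimal sketch of how the resolution check could be expressed, assuming the leaderboard average scores are read off manually; the model name, parameter count, token count, and scores below are placeholders, not actual results.

```python
# Minimal sketch of the resolution check; all numeric values are placeholders
# to be filled in from the Open LLM Leaderboard and the model's training report.
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    params_billion: float     # total parameter count, in billions
    tokens_trillion: float    # training tokens, in trillions
    leaderboard_avg: float    # Open LLM Leaderboard average score


def resolves_yes(model: Candidate, llama2_13b_avg: float) -> bool:
    """True if the candidate meets this market's resolution criteria."""
    return (
        model.params_billion <= 7.5
        and model.tokens_trillion <= 2.0
        and model.leaderboard_avg > llama2_13b_avg
    )


# Example usage with made-up numbers (hypothetical model and scores):
candidate = Candidate("mamba-7b-example", 7.0, 2.0, 0.0)
print(resolves_yes(candidate, llama2_13b_avg=0.0))
```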
