Will Llama 3 use Mixture of Experts?
Resolved NO on Jul 30


@mods

Llama 405B was released on 23 July, and it's a dense model. The account of the market author is deleted. Could a moderator (or the equivalent on Manifold) resolve the market?

let's wait for this

In case both dense and MoE models are released under the name of Llama 3, I am leaning towards resolving to the architecture that the BEST model uses (per the LMSYS Arena).

@Sss19971997 If both dense and MoE are released, I think it should resolve YES.

@ErikBjareholt If they have a 640B dense model and a 16B MoE, it seems wrong to resolve to MoE.

predicted YES

@Sss19971997 Perhaps, but more likely we'll see 8x7B MoE (like Mixtral) and also a 70B dense model.

In that case, do you think this should resolve NO?

@ErikBjareholt It depends on the performance. Very likely an 8x7B MoE will be worse than a 70B dense model.
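
A rough way to see why that comparison isn't straightforward is to count active parameters per token. The sketch below uses Mixtral 8x7B's publicly reported figures (~46.7B total, ~12.9B active per token with top-2 routing); it is an illustration of the arithmetic, not anything from this market or from Meta.

```python
# Back-of-the-envelope active-parameter comparison (illustrative only).
dense_70b_active = 70e9    # dense model: every parameter participates for every token
mixtral_total    = 46.7e9  # 8x7B MoE total is < 8 * 7B because attention layers are shared
mixtral_active   = 12.9e9  # only 2 of the 8 expert FFNs run per token (top-2 routing)

print(f"70B dense: {dense_70b_active / 1e9:.0f}B active params per token")
print(f"8x7B top-2 MoE: ~{mixtral_active / 1e9:.1f}B active params per token "
      f"(of ~{mixtral_total / 1e9:.1f}B total)")
```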

Foundation models are not MoE, right? MoE is a technique to increase the throughput of foundation models.

@quantizor Wrong. MoE gains most of its benefit from pretraining.

@quantizor MoE is applied during pretraining, like a very smart form of dropout.
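
For readers unfamiliar with the technique, here is a minimal sketch of a top-k routed MoE feed-forward layer in PyTorch. It shows why the routing is learned jointly with the experts during pretraining rather than bolted on afterwards. The class name and layer sizes are made up for illustration; this is not Llama's or Mixtral's actual implementation.

```python
# Minimal sketch of a top-2 routed MoE feed-forward layer (hypothetical sizes).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # learned gating network
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                      # x: (tokens, d_model)
        logits = self.router(x)                # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the chosen experts
        out = torch.zeros_like(x)
        # Each token is processed only by its top-k experts; the router and the
        # experts are trained together, which is why MoE is set up at pretraining time.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = MoELayer()
tokens = torch.randn(16, 512)
print(layer(tokens).shape)  # torch.Size([16, 512])
```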

@HanchiSun Arbitrage opportunity
