Is Google Gemini 1.0 Ultra a Mixture of Experts?
Dec 6
Pretty sure that Gemini 1.0 is not MoE, but 1.5 is. Would perfectly explain comparative performance and "53% increase in parameters" yet "using less compute".

Gemini 1.5 boasts a remarkable 53% increase in parameters compared to Gemini 1, empowering it to handle more complex language tasks and store a wealth of information. (source: random linkedin post)

It shows dramatic improvements across a number of dimensions and 1.5 Pro achieves comparable quality to 1.0 Ultra, while using less compute. (source: Google)
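
To make the "more parameters, less compute" point concrete, here's a rough sketch of the arithmetic. All numbers are made up for illustration (arbitrary units, picked only so the total lands on +53%); nothing here is Gemini's actual configuration, and `moe_param_counts` is just a hypothetical helper. The assumption is that per-token compute scales roughly with the parameters a token actually passes through, which in a sparse MoE is only the shared layers plus the few experts the router selects.

```python
# Hypothetical numbers for illustration only; not Gemini's real configuration.

def moe_param_counts(shared, per_expert, num_experts, top_k):
    """Return (total, active-per-token) parameter counts for a sparse MoE.

    shared:      parameters every token uses (attention, embeddings, router)
    per_expert:  parameters in one expert feed-forward block
    num_experts: experts stored in the model
    top_k:       experts the router selects for each token
    """
    total = shared + per_expert * num_experts
    active = shared + per_expert * top_k
    return total, active


dense_params = 100  # arbitrary units; a dense model uses all of them per token

total, active = moe_param_counts(shared=33, per_expert=15, num_experts=8, top_k=2)

# Compare stored parameters and per-token active parameters against the dense model.
print(f"MoE total:  {total} ({total / dense_params - 1:+.0%} vs dense)")
print(f"MoE active: {active} ({active / dense_params - 1:+.0%} vs dense)")
```

With these toy numbers the MoE stores more parameters than the dense model but touches fewer of them per token, which is the pattern the two quotes above would be consistent with. It doesn't prove anything about either model, of course.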

You are saying Gemini 1, right?

@Sss19971997 yes, Gemini 1

No previous Google model was a MoE. There was no hint of MoE in the paper. The fact that they trained differently-sized versions of the model (Nano/Pro/Ultra) feels like evidence against: that's hard to do with experts.

How will this be resolved? Does Google have to officially confirm it, or is a plausible leak enough?

@Coagulopath yeah all the speculation I can find is saying it’s not MoE.

If there’s no official statement, this resolves based on the most plausible leaks/speculation 1 year after launch (Dec 6, 2024). Mods can re-resolve if Google confirms it after that point.

