Is GPT-4 a mixture of experts?


It's definitely an ensemble model, but I don't think it's a mixture of experts (i.e., I believe it consistently accesses all subnetworks without gating).

OpenAI hasn't confirmed anything either way, though. From the GPT-4 technical report: "Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar."

Shai Natapov

Your markets are often very vague, which is why I avoid betting.

Isaac King

@ShaiNatapov I think that's intentional, Gigacasting seems to enjoy controversy.

Isaac King

And given how many traders they frequently get, I guess clear resolution criteria aren't something most traders actually care about? I find that weird too.

Forrest Taylor

What does MoE mean?

@ForrestTaylor The model includes something like this:

@ForrestTaylor Mixture of Experts
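
For anyone who wants the distinction from the top comment spelled out, here is a toy sketch of "all subnetworks without gating" versus a gated mixture of experts. Everything in it is made up for illustration: the four `EXPERTS` are trivial functions standing in for subnetworks, the `GATE` weights are arbitrary, and none of it says anything about how GPT-4 is actually built, since that hasn't been published.

```python
import math


def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical "experts": tiny functions standing in for subnetworks.
EXPERTS = [
    lambda x: 2.0 * x,
    lambda x: x + 1.0,
    lambda x: -x,
    lambda x: x * x,
]

# Hypothetical gate parameters: one (weight, bias) pair per expert.
GATE = [(1.0, 0.0), (-1.0, 0.0), (0.5, 1.0), (0.0, -1.0)]


def ensemble(x):
    """Dense ensemble: every expert runs and the outputs are averaged."""
    outputs = [f(x) for f in EXPERTS]
    return sum(outputs) / len(outputs), len(outputs)


def mixture_of_experts(x, k=2):
    """Sparse MoE: a gate scores the experts and only the top k run."""
    probs = softmax([w * x + b for w, b in GATE])
    # Indices of the k experts with the highest gate probability.
    top = sorted(range(len(EXPERTS)), key=lambda i: probs[i], reverse=True)[:k]
    # Renormalise the selected probabilities so the mixing weights sum to 1.
    norm = sum(probs[i] for i in top)
    output = sum(probs[i] / norm * EXPERTS[i](x) for i in top)
    return output, len(top)


if __name__ == "__main__":
    print("ensemble:", ensemble(2.0))
    print("mixture of experts:", mixture_of_experts(2.0, k=2))
```

The second value each function returns is how many experts actually ran. The ensemble always runs all four; the MoE runs only `k` of them, chosen per input by the gate, which is the whole point of the design: total parameter count can grow without the compute per input growing with it.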

Robin Green

What does this mean? Basically a bunch of experts in a trenchcoat, pretending to be an AI? Don't be silly.