It's definitely an ensemble model, but I don't think it's a mixture of experts (i.e., I believe it consistently accesses all subnetworks without gating).
Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar.
Your markets are often very vague, which is why I avoid betting on them.
@ShaiNatapov I think that's intentional, Gigacasting seems to enjoy controversy.
And given how many traders their markets frequently attract, I guess clear resolution criteria aren't something most traders actually care about? I find that weird too.
What does MoE mean?
@ForrestTaylor The model includes something like this: https://arxiv.org/abs/1701.06538
@ForrestTaylor Mixture of Experts
What does this mean? Basically a bunch of experts in a trenchcoat, pretending to be an AI? Don't be silly.
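For anyone still wondering what "gating" means in this thread: here's a minimal sketch of a Mixture-of-Experts layer with top-k gating, in the spirit of the Shazeer et al. paper linked above. All the names, shapes, and the linear "experts" are illustrative assumptions for the sketch, not anything known about GPT-4.

```python
# Toy sparse MoE layer: a gating network scores the experts,
# and only the top-k experts actually run for a given input.
# Shapes and expert definitions are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, top_k = 8, 4, 2

# Each "expert" is just a linear map in this sketch.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts))

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def moe_forward(x):
    """Route input x to the top-k experts chosen by the gating network."""
    logits = x @ gate_w                 # one score per expert, shape (n_experts,)
    top = np.argsort(logits)[-top_k:]   # indices of the k highest-scoring experts
    weights = softmax(logits[top])      # renormalize over only the selected experts
    # Only the selected experts run -- the sparsity that makes MoE cheap.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

x = rng.standard_normal(d_model)
y = moe_forward(x)
print(y.shape)  # (8,)
```

The contrast with the claim in the top comment: an ensemble without gating would run *all* the subnetworks every time and average them, whereas the gate here picks only `top_k` of them per input.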