Will GPT-4 have at least 100 trillion parameters?
Resolved N/A on Nov 17

This will resolve TRUE if GPT-4's announcement states that it has at least 100 trillion parameters.

For context, GPT-3 has 175 billion parameters.


predicted NO

Sorry for the delay here, and for writing the market poorly. The title "Will GPT-4 have at least 100 trillion parameters?" is quite different from my description "GPT-4's announcement states that it has at least 100 trillion parameters."

OpenAI has not revealed the number of parameters, though George Hotz has claimed it's 1.7 trillion. So according to the title, this market may resolve N/A by the time it closes, if it remains without an authoritative source.

According to my description, it would resolve NO, since OpenAI didn't state the number of parameters when they announced GPT-4. But traders could not reasonably have been expected to infer the description's resolution criteria from the title alone. And conditional on GPT-4 having over 100 trillion parameters, the probability that OpenAI would state that when introducing GPT-4 is plausibly small.

Given the ambiguity, I am resolving this N/A.

@MaxGhenis Can this resolve?

predicted NO

@MaxGhenis This market is supposed to be conditional on the value stated in the announcement. Unfortunately, it seems the announcement does not contain those architecture details. Resolve N/A?

predicted NO

@tinytitan Technically correct, the best kind of correct.

@tinytitan Why would it be N/A?

if GPT-4's announcement states that it has at least 100 trillion parameters

There was an announcement, right? Did it state it or not? (As best I'm aware, it did not.)

bought Ṁ10 of NO

Should be lower than 10%... should be 1% or less. The cost to train a 100T-parameter model in 2023 would likely be comparable to total US military spending in Ukraine to date, and I don't see a company ponying up that money. Unless OpenAI comes out and redefines what a parameter is for marketing purposes, or says some bullshit like "precision parameters" (which is not the same thing) and everyone's mind is blown because they called a cat a dog... it ain't happening. Now, go forward another 4 years and the cost might be closer to $200M instead of $20B, so I could see it happening in 2027 or after, since Megatron-Turing NLG 530B cost Microsoft on the order of $100M.
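A rough sketch of where those dollar figures could come from, assuming training cost scales roughly linearly with parameter count and taking the ~$100M order-of-magnitude figure for Megatron-Turing NLG as the anchor (both assumptions are my reading of the comment, not anything OpenAI has stated):

```python
# Back-of-the-envelope: scale the ~$100M Megatron-Turing NLG (530B params) cost
# linearly up to 100T params. Linear scaling is an assumption; real training
# cost also depends on token count, hardware, and efficiency.
mtnlg_params = 530e9      # Megatron-Turing NLG parameter count
mtnlg_cost_usd = 100e6    # order-of-magnitude cost figure from the comment
target_params = 100e12    # 100 trillion parameters

naive_cost = mtnlg_cost_usd * (target_params / mtnlg_params)
print(f"Naive 2023-ish cost estimate: ${naive_cost / 1e9:.0f}B")  # ~$19B, i.e. the ~$20B ballpark

# If hardware/efficiency gains cut cost ~100x over ~4 years (assumption),
# the same run lands near the ~$200M figure suggested for 2027+.
print(f"With a 100x cost reduction: ${naive_cost / 100 / 1e6:.0f}M")
```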

Manifold in the wild: A Tweet by Max Ghenis

@AlexHormozi https://manifold.markets/MaxGhenis/will-gpt4-have-at-least-100-trillio?referrer=MaxGhenis

bought Ṁ10 of YES

Buying Yes because the potential upside is so big

sold Ṁ91 of NO

I think this is mostly a question about whether GPT-4 will be a MoE
(https://manifold.markets/vluzko/will-gpt4-be-a-dense-model), and then about how many experts a GPT-4 MoE would have.

The largest MoE model in the GShard paper has 2048 experts per expert-layer. https://download.arxiv.org/pdf/2006.16668v1

If a 300B parameter model had 2048 experts per layer, that would be ~600T parameters
If a 175B parameter model had 2048 experts per layer, that would be ~360T parameters

GPT-3's batch size was 500K (https://arxiv.org/pdf/2005.14165.pdf), so 2K experts wouldn't be mad, esp. with better hardware.

Given that the odds GPT-4 will be a MoE seem ~30% to me, and that, conditional on being a MoE, it would cross the 100T threshold ~35% of the time, I think the odds that GPT-4 will have >100T parameters are ~11%.
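A quick sketch of the arithmetic above, using the naive assumption (corrected in a later comment) that total parameters scale linearly with the number of experts; the 30% and 35% figures are the commenter's own estimates:

```python
# Naive estimate: total params = dense base params x number of experts.
# (A later comment corrects this: only the feedforward layers are duplicated.)
experts = 2048  # per expert-layer, as in the largest GShard model

for base in (300e9, 175e9):
    total = base * experts
    print(f"{base / 1e9:.0f}B base x {experts} experts ~= {total / 1e12:.0f}T params")
# -> ~614T and ~358T, i.e. the ~600T / ~360T figures above

# Combining the commenter's estimates:
p_moe = 0.30             # P(GPT-4 is a MoE)
p_100t_given_moe = 0.35  # P(>100T params | MoE)
print(f"P(GPT-4 > 100T params) ~= {p_moe * p_100t_given_moe:.3f}")  # ~0.105, i.e. ~11%
```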

predicted YES

@NoaNabeshima I feel most uncertain about the probability it crosses the 100T threshold conditional on GPT-4 being a MoE, so I think this estimate could be improved if someone thought about that

predicted YES

@NoaNabeshima in particular it seems plausible my probability should be higher

predicted YES

@NoaNabeshima but also lower, eh

sold Ṁ2 of YES

@NoaNabeshima Oh, also: I'm incorrectly assuming that the number of parameters scales linearly with the number of experts. In practice, probably only the feedforward layers would be duplicated, which (if the feedforward layers hold about half the parameters) makes the parameter scaling factor (1+E)/2 instead of E, where E is the number of experts.

Additionally, not every feedforward layer needs to be a mixture-of-experts layer. E.g., in GShard only half of them are mixture-of-experts layers. So if a fraction P of the feedforward layers are mixture-of-experts layers, the parameter scaling factor would be

(1 + (P*E + (1-P)))/2

For P = 0.5, and 2048 experts, that's a scaling factor of 512x, so only 89T parameters for a 175B parameter base model.
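A small sketch of that corrected estimate, using the scaling factor above (which, per the earlier comment, assumes the feedforward layers hold about half of the base model's parameters):

```python
def moe_scaling_factor(num_experts: int, moe_fraction: float) -> float:
    """Parameter scaling factor when a fraction `moe_fraction` of feedforward
    layers are mixture-of-experts layers with `num_experts` experts each,
    assuming feedforward layers hold ~half of the dense model's parameters."""
    return (1 + (moe_fraction * num_experts + (1 - moe_fraction))) / 2

base_params = 175e9
factor = moe_scaling_factor(num_experts=2048, moe_fraction=0.5)
print(f"scaling factor ~= {factor:.1f}x")                           # ~512.8x
print(f"total params ~= {base_params * factor / 1e12:.0f}T")        # ~90T, i.e. the ~89T figure above
```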

predicted NO

@NoaNabeshima

If E is large,

(1+(P*E + (1-P)))/2 ~= P*E/2
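Numerically, the approximation is already very close for the values used above (my check):

```python
# Exact scaling factor vs. the large-E approximation, for P = 0.5 and E = 2048.
P, E = 0.5, 2048
exact = (1 + (P * E + (1 - P))) / 2
approx = P * E / 2
print(exact, approx)  # 512.75 vs 512.0 -- within ~0.15%
```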

Started one that uses a numeric answer:

In an August 2021 Wired article, Andrew Feldman, founder and CEO of Cerebras, said, "From talking to OpenAI, GPT-4 will be about 100 trillion parameters."

According to Alberto Romero, though, "not much later, Sam Altman, OpenAI’s CEO, denied the 100T GPT-4 rumor in a private Q&A."
