GPT-3 has a staggering 175 BILLION parameters.
To put that into context:
Hugging Face's 176-billion-parameter model (BLOOM) took 3.5 months on 384 top-of-the-line GPUs to train...
GPT-3 is also over 2 years old.
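For a rough sense of scale, here's a minimal back-of-the-envelope sketch, assuming the widely used C ≈ 6·N·D approximation for transformer training compute (N = parameters, D = training tokens). The token counts are assumptions: 300B for GPT-3 from its paper, 350B for BLOOM as mentioned further down this thread.

```python
# Back-of-the-envelope training-compute estimate using the common
# C ~ 6 * N * D approximation (N = parameters, D = training tokens).
# Assumed token counts: 300B for GPT-3 (per its paper), 350B for BLOOM
# (as stated later in this thread).

def train_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6 * params * tokens

for name, n, d in [("GPT-3", 175e9, 300e9), ("BLOOM", 176e9, 350e9)]:
    print(f"{name}: ~{train_flops(n, d):.2e} FLOPs")
# GPT-3: ~3.15e+23 FLOPs
# BLOOM: ~3.70e+23 FLOPs
```

Both land in the same ~3e23 FLOPs ballpark, which is why the two training runs are comparable despite different hardware setups.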
Nov 17, 1:56am: How many parameters with GPT-4 have? → How many parameters will GPT-4 have?
🏅 Top traders

# | Name | Total profit
---|---|---
1 | | Ṁ2,055
2 | | Ṁ1,939
3 | | Ṁ780
4 | | Ṁ768
5 | | Ṁ632
@Lorenzo fair question! As the market "Will GPT-4 have over 1 trillion parameters?" resolved YES, >1600 is the only suitable option here.
Why is there an 801-1600 option as well as four separate 801-1000, 1001-1200, 1201-1400, and 1401-1600 options? How will this resolve if GPT-4 has, say, 1050 billion parameters? Will both the wide and the narrow options be chosen? With what weights?
Note that HuggingFace's model was trained on 350B tokens. The Chinchilla-optimal token count for a 175B-parameter model is about 3,500B tokens (roughly 20 tokens per parameter), so 10x as much!
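For anyone who wants to plug in other model sizes, here's a minimal sketch of that rule of thumb, assuming the ~20-tokens-per-parameter heuristic from the Chinchilla paper (the exact ratio varies with the compute budget):

```python
# Chinchilla rule of thumb: compute-optimal training uses roughly
# 20 tokens per parameter. The exact ratio depends on the compute
# budget, so treat this as an order-of-magnitude estimate.
TOKENS_PER_PARAM = 20

def chinchilla_optimal_tokens(params: float) -> float:
    """Approximate compute-optimal training token count for a model size."""
    return TOKENS_PER_PARAM * params

for n in [175e9, 176e9]:
    print(f"{n / 1e9:.0f}B params -> ~{chinchilla_optimal_tokens(n) / 1e9:,.0f}B tokens")
# 175B params -> ~3,500B tokens
# 176B params -> ~3,520B tokens
```

By this measure, both GPT-3 and BLOOM were trained on roughly a tenth of the compute-optimal token count for their size.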