GPT-3 has a staggering 175 BILLION parameters.
To put that into context:
Hugging Face's 176-billion-parameter model (BLOOM) took 3.5 months on 384 top-of-the-line GPUs to train...
GPT-3 is also over 2 years old.
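For a rough sense of scale, here's a minimal back-of-the-envelope sketch, assuming the widely used C ≈ 6·N·D approximation for transformer training compute (N = parameters, D = training tokens). The token counts are assumptions: 300B for GPT-3 from its paper, 350B for BLOOM as mentioned further down this thread.

```python
# Back-of-the-envelope training-compute estimate using the common
# C ~ 6 * N * D approximation (N = parameters, D = training tokens).
# Assumed token counts: 300B for GPT-3 (per its paper), 350B for BLOOM
# (as stated later in this thread).

def train_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6 * params * tokens

for name, n, d in [("GPT-3", 175e9, 300e9), ("BLOOM", 176e9, 350e9)]:
    print(f"{name}: ~{train_flops(n, d):.2e} FLOPs")
# GPT-3: ~3.15e+23 FLOPs
# BLOOM: ~3.70e+23 FLOPs
```

Both land in the same ~3e23 FLOPs ballpark, which is why the two training runs are comparable despite different hardware setups.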
Nov 17, 1:56am: How many parameters with GPT-4 have? → How many parameters will GPT-4 have?
🏅 Top traders

# | Name | Total profit
---|---|---
1 | | Ṁ2,055
2 | | Ṁ1,939
3 | | Ṁ780
4 | | Ṁ768
5 | | Ṁ632
@Lorenzo fair question! As the market "Will GPT-4 have over 1 trillion parameters?" resolved YES, >1600 is the only suitable option here.
Why is there an 801-1600 option as well as four separate 801-1000, 1001-1200, 1201-1400, and 1401-1600 options? How will this resolve if GPT-4 has, say, 1050 billion parameters? Will both the wide and the narrow options be chosen? With what weights?
Note that HuggingFace's model was trained on 350B tokens. The Chinchilla-optimal token count for a 175B-parameter model is about 3,500B tokens (roughly 20 tokens per parameter), so 10x as much!
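For anyone who wants to plug in other model sizes, here's a minimal sketch of that rule of thumb, assuming the ~20-tokens-per-parameter heuristic from the Chinchilla paper (the exact ratio varies with the compute budget):

```python
# Chinchilla rule of thumb: compute-optimal training uses roughly
# 20 tokens per parameter. The exact ratio depends on the compute
# budget, so treat this as an order-of-magnitude estimate.
TOKENS_PER_PARAM = 20

def chinchilla_optimal_tokens(params: float) -> float:
    """Approximate compute-optimal training token count for a model size."""
    return TOKENS_PER_PARAM * params

for n in [175e9, 176e9]:
    print(f"{n / 1e9:.0f}B params -> ~{chinchilla_optimal_tokens(n) / 1e9:,.0f}B tokens")
# 175B params -> ~3,500B tokens
# 176B params -> ~3,520B tokens
```

By this measure, both GPT-3 and BLOOM were trained on roughly a tenth of the compute-optimal token count for their size.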