Will GPT-4 be trained (roughly) compute-optimally using the best-known scaling laws at the time?
30% chance · Jun 2

This question resolves YES if GPT-4 is trained on enough data to roughly match the prescriptions of the best-known scaling laws at the time of its training. Currently, that means the Chinchilla scaling laws. By "roughly," I mean it can be off by 20%. For example, if GPT-4 has 100B parameters, the (currently known) optimal scaling laws would prescribe roughly 2T training tokens, so GPT-4 would need to be trained on ~1.6T to ~2.4T tokens for this question to resolve YES.
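As a sketch of the resolution criterion, assuming the common Chinchilla rule of thumb of roughly 20 training tokens per parameter (an assumption; the question text does not pin down an exact ratio, and the function names here are hypothetical):

```python
def chinchilla_optimal_tokens(params: float, tokens_per_param: float = 20.0) -> float:
    """Rule-of-thumb Chinchilla prescription: ~20 training tokens per parameter."""
    return params * tokens_per_param

def resolves_yes(params: float, trained_tokens: float, tolerance: float = 0.20) -> bool:
    """YES if the actual token count is within +/-20% of the optimal prescription."""
    optimal = chinchilla_optimal_tokens(params)
    return abs(trained_tokens - optimal) <= tolerance * optimal

# A 100B-parameter model prescribes ~2T tokens under this rule of thumb,
# so ~1.6T-2.4T actual training tokens would resolve YES.
print(resolves_yes(100e9, 2.0e12))  # True
print(resolves_yes(100e9, 1.0e12))  # False
```

The 20% tolerance is applied symmetrically around the prescribed token count, matching the band described above.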
