Do scaling laws happen because models experience a ton of tiny phase changes which average out to a smooth curve?
61% chance

Problem 5.31 from @NeelNanda's 200 COP.

"D* 5.31 -  Hypothesis: The reason scaling laws happen is that models experience a ton of tiny phase changes, which average out to a smooth curve because of the law of large numbers. Can you find evidence for or against that? Are phase changes everywhere?"

Resolves to the best evidence available by the end of 2030.
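
To make the averaging argument in the quoted hypothesis concrete, here is a minimal toy sketch. It is not evidence about real networks. It assumes the loss is an equal-weighted sum of many independent components, that each component drops along a sharp sigmoid, and that the transition steps are uniformly random. The names `toy_loss_curve` and `max_step_change`, and the parameter `sharpness`, are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def toy_loss_curve(n_components, n_steps=200, sharpness=2.0, seed=0):
    """Average of n_components sharp sigmoid drops (assumed toy model).

    Each component carries 1/n_components of the loss and "phase changes"
    at its own uniformly random step. Returns the loss at each step.
    """
    rng = random.Random(seed)
    centers = [rng.uniform(0, n_steps - 1) for _ in range(n_components)]
    curve = []
    for t in range(n_steps):
        # Loss still remaining in each component at step t.
        remaining = sum(1.0 - sigmoid(sharpness * (t - c)) for c in centers)
        curve.append(remaining / n_components)
    return curve


def max_step_change(curve):
    """Largest single-step change in loss: a crude roughness measure."""
    return max(abs(b - a) for a, b in zip(curve, curve[1:]))


if __name__ == "__main__":
    # With more components, the largest single-step jump should shrink.
    for n in (1, 10, 100, 1000):
        print(n, round(max_step_change(toy_loss_curve(n)), 4))
```

In this toy setup a single component produces one visible jump, while a thousand components produce a curve with no individually visible jumps. The resulting curve here is roughly linear rather than a power law. Getting a power-law shape would need further assumptions about how transition times and component sizes are distributed, which the hypothesis as quoted does not specify.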

Suppose there are lots of phase changes but also smooth changes: the model doesn't immediately learn the exact skip-trigram frequencies, or even the best local circuit for encoding a single skip trigram; it gradually makes these more precise over time. If smooth changes (and possibly other kinds of change) account for 20% of the loss change per step, and faster, almost-discrete-looking phase changes account for 80%, how will this resolve?

@NoaNabeshima How low can the discrete-looking phase changes contribution go before this doesn't resolve Yes?
