Will softmax_1 solve the 'outlier features' problem in quantization?
Resolved NO on Jan 1.

See this blog post: https://www.evanmiller.org/attention-is-off-by-one.html, and in particular this paragraph:

Even though softmax1 is facially quite boring, I’m 99.44% sure that it will resolve the outlier feedback loop that’s making quantization the subject of cascades of research. If you want to run some experiments and prove me right, DM me on Twitter and we’ll get a paper going.
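For readers who have not opened the post: as I understand it, softmax1 is ordinary softmax with an extra 1 added to the denominator, i.e. exp(x_i) / (1 + sum_j exp(x_j)), so the outputs can sum to less than 1 and an attention head can put almost no weight anywhere. Below is a minimal pure-Python sketch under that reading. The function names and the max-shift used for numerical stability are my own choices, not taken from the post.

```python
import math

def softmax(xs):
    """Standard softmax: outputs always sum to 1."""
    m = max(xs)  # shift by the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def softmax1(xs):
    """Assumed form of softmax1: exp(x_i) / (1 + sum_j exp(x_j)).

    The shift m includes 0 so that the extra "1" in the denominator,
    which becomes exp(-m) after shifting, cannot overflow either.
    """
    m = max(0.0, max(xs))
    exps = [math.exp(x - m) for x in xs]
    total = math.exp(-m) + sum(exps)
    return [e / total for e in exps]

if __name__ == "__main__":
    scores = [-10.0, -10.0, -10.0]
    print(sum(softmax(scores)))   # weights are forced to sum to 1
    print(sum(softmax1(scores)))  # weights may all stay close to 0
```

With strongly negative scores, softmax still spreads a total weight of 1 across the inputs, while softmax1 lets the total fall toward 0. That escape hatch is the mechanism the quoted paragraph bets on.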

