Do transformer language models prefer superposition even when the number of available neuron dimensions is greater than the number of input features?
210 · Ṁ940 · resolved Jan 1
Resolved YES
This question is managed and resolved by Manifold.
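To make the question concrete, here is a minimal sketch (not the market's resolution criterion) of the toy "superposition" setup the title alludes to, loosely in the style of toy-models-of-superposition experiments: learn to reconstruct sparse features through a linear map and check whether feature directions stay near-orthogonal when the hidden width exceeds the number of features. All sizes, sparsity levels, and thresholds below are assumptions for illustration.

```python
# Toy superposition probe: more hidden dimensions than features.
# Assumed hyperparameters throughout; this is an illustrative sketch only.
import torch

n_features, n_hidden = 5, 8        # hidden width exceeds feature count
sparsity = 0.9                      # probability a given feature is zero
W = torch.nn.Parameter(torch.randn(n_features, n_hidden) * 0.1)
b = torch.nn.Parameter(torch.zeros(n_features))
opt = torch.optim.Adam([W, b], lr=1e-2)

for step in range(5000):
    # Sparse, uniformly distributed feature activations.
    x = torch.rand(1024, n_features)
    x = x * (torch.rand_like(x) > sparsity)
    # Encode into the hidden space and decode with tied weights.
    recon = torch.relu(x @ W @ W.T + b)
    loss = ((recon - x) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# If each feature gets a dedicated direction, off-diagonal overlaps of the
# normalized rows of W should be ~0; large overlaps indicate superposition.
W_hat = W.detach() / W.detach().norm(dim=1, keepdim=True)
overlaps = W_hat @ W_hat.T - torch.eye(n_features)
print("max off-diagonal overlap:", overlaps.abs().max().item())
```

In this regime (hidden width greater than the feature count), such toy models typically have no capacity pressure to superpose, which is what the market asks about for real transformer language models.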
🏅 Top traders
# | Name | Total profit |
---|---|---|
1 | | Ṁ73 |
2 | | Ṁ1 |
3 | | Ṁ0 |
4 | | Ṁ0 |
Related questions
If LMs store info as features in superposition, does # features scale superlinearly with number of model parameters?
41% chance
Will Transformer based architectures still be SOTA for language modelling by 2026?
79% chance
Are Mixture of Experts (MoE) transformer models generally more human-interpretable than dense transformers?
45% chance
Will the most capable, public multimodal model at the end of 2027 in my judgement use a transformer-like architecture?
63% chance
Will superposition in transformers be mostly solved by 2026?
73% chance
Do you think Mixture of Experts (MoE) transformer models are generally more human-interpretable than dense transformers?
POLL
If LMs store info as features in superposition, are there >300K features in GPT-2 small L7? (see desc)
59% chance
Will we find polysemanticity via superposition in neurons in the brain before 2040?
64% chance