Will adding an Attention layer improve the performance of my stock trading model?
Will replacing LayerNorm with something that doesn't use current vector statistics remove outlier channels?
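For context on "current vector statistics": standard LayerNorm normalizes each activation vector by that vector's own mean and variance, recomputed per token, which is the mechanism this question hypothesizes drives outlier channels. A minimal NumPy sketch (names are illustrative):

```python
import numpy as np

def layernorm(x, gamma, beta, eps=1e-5):
    """Standard LayerNorm: mean and variance are statistics of the
    *current* activation vector, recomputed for every token."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta
```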
Will loss curves on Pythia models of different sizes trained on the same data in the same order be similar?
Will geometric superposition shapes/configs from the ReLU output model appear in the residual stream of LLMs?
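For reference, the "ReLU output model" is the toy model from Anthropic's Toy Models of Superposition: n sparse features are squeezed into m < n hidden dimensions and reconstructed through a ReLU. A minimal PyTorch sketch (sizes are illustrative):

```python
import torch
import torch.nn as nn

class ReLUOutputModel(nn.Module):
    """Toy 'ReLU output' model: n sparse features in m < n dims."""
    def __init__(self, n_features=5, d_hidden=2):
        super().__init__()
        self.W = nn.Parameter(torch.randn(d_hidden, n_features) * 0.1)
        self.b = nn.Parameter(torch.zeros(n_features))

    def forward(self, x):
        h = self.W @ x.T                               # project down
        return torch.relu((self.W.T @ h).T + self.b)   # reconstruct
```

When trained to reconstruct sparse inputs, the columns of W settle into the geometric configurations (antipodal pairs, pentagons, and so on) that this question asks about finding in real residual streams.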
Will dual n-back training work to improve working memory?
Does the apparent phase change observed in features/neurons have any connection to phase changes in compressed sensing?
By EOY 2025, will the model with the lowest perplexity on Common Crawl not be based on transformers?
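For concreteness, perplexity is the exponentiated mean per-token negative log-likelihood over the evaluation corpus, so lower is better. A minimal sketch:

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp(mean negative log-likelihood per token).
    token_nlls: natural-log NLL for each token in the eval corpus."""
    return math.exp(sum(token_nlls) / len(token_nlls))
```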
Are LLMs easy to align because unsupervised learning imbues them with an ontology in which human values are easy to express?
Does the Q in Q* stand for either Q-learning or Q-values?
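For readers unfamiliar with the terminology: a Q-value estimates the expected discounted return of taking an action in a state, and Q-learning is the classic update rule for those estimates. A minimal tabular sketch (assuming `Q` is a dict mapping states to per-action value dicts, initialized for all states):

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: move Q(s, a) toward the
    bootstrapped target r + gamma * max_a' Q(s', a')."""
    target = r + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])
```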
Algebraic value editing works better for larger language models, all else equal
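Background for the claim: algebraic value editing steers a model by adding a scaled difference of two prompts' activations back into the residual stream at a chosen layer. A minimal sketch of the arithmetic, with placeholder arrays standing in for real hooked activations:

```python
import numpy as np

def steering_vector(act_positive, act_negative, coeff=4.0):
    """Steering vector = coeff * (activations on a 'positive' prompt
    minus activations on a 'negative' prompt) at a chosen layer."""
    return coeff * (act_positive - act_negative)

def apply_steering(residual_stream, vec):
    # Add the steering vector at every token position.
    return residual_stream + vec

# Hypothetical usage: these would come from forward passes with hooks.
h_love = np.random.randn(10, 768)  # placeholder, "Love" prompt
h_hate = np.random.randn(10, 768)  # placeholder, "Hate" prompt
vec = steering_vector(h_love.mean(axis=0), h_hate.mean(axis=0))
```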
How many FLOPs will go into training the first ASL-3 model?
Will a transformer circuit be found for predicting the correct indentation level for a new line in Python this year?
Are Mixture of Experts (MoE) transformer models generally more human-interpretable than dense transformers?
When will a language model be fine-tuned via self-play or expert iteration and achieve a significant performance increase?
Will my custom optimizer (Adalite) outperform Adam on evaluation loss in more than 1 of my tests?
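For reference, the Adam baseline named in the question performs the bias-corrected moment updates below (a minimal NumPy sketch; Adalite itself is not reproduced here):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update with bias correction (Kingma & Ba, 2015).
    t is the 1-indexed step count."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad**2
    m_hat = m / (1 - b1**t)
    v_hat = v / (1 - b2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```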
If LMs store info as features in superposition, are there >300K features in GPT-2 small L7? (see desc)
Do you think Mixture of Experts (MoE) transformer models are generally more human-interpretable than dense transformers?
Do transformer language models prefer superposition even when the number of available neuron dimensions exceeds the number of input features?
Will softmax_1 solve the 'outlier features' problem in quantization?
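For context, softmax_1 (from Evan Miller's "Attention Is Off by One" proposal) adds 1 to the softmax denominator so an attention head can assign near-zero total weight, which is hypothesized to remove the outlier activations that make quantization hard. A minimal NumPy sketch:

```python
import numpy as np

def softmax_1(x, axis=-1):
    """softmax_1(x)_i = exp(x_i) / (1 + sum_j exp(x_j)).
    Shift by max(x, 0) for numerical stability; the implicit '1'
    in the denominator becomes exp(-m) after the shift."""
    m = np.maximum(np.max(x, axis=axis, keepdims=True), 0.0)
    e = np.exp(x - m)
    return e / (np.exp(-m) + e.sum(axis=axis, keepdims=True))
```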
Will mechanistic/transformer interpretability [e.g., Neel Nanda's work] end up affecting p(doom) by more than 5%?