Are Mixture of Experts (MoE) transformer models generally more human-interpretable than dense transformers?
45% chance
This question is managed and resolved by Manifold.
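For context on the distinction the question turns on: in a dense transformer, every token passes through the same feed-forward block, while an MoE transformer routes each token to a small subset of expert feed-forward blocks chosen by a learned gate. The sketch below is purely illustrative; the top-1 router, layer sizes, and all names are assumptions for exposition, not the implementation of any particular model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top1MoEFeedForward(nn.Module):
    """Illustrative top-1 Mixture-of-Experts feed-forward layer.

    Each token is sent to exactly one expert MLP, chosen by a softmax
    gate over expert logits. A dense transformer instead applies one
    shared MLP to every token.
    """

    def __init__(self, d_model: int = 64, d_hidden: int = 256, n_experts: int = 4):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)  # router over experts
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> flatten tokens for routing
        tokens = x.reshape(-1, x.shape[-1])
        gate_probs = F.softmax(self.gate(tokens), dim=-1)  # (n_tokens, n_experts)
        top_prob, top_idx = gate_probs.max(dim=-1)         # top-1 routing decision

        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e                            # tokens routed to expert e
            if mask.any():
                # scale by the gate probability so routing stays differentiable
                out[mask] = top_prob[mask].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape(x.shape)

# Quick check: route a small batch of random "tokens".
if __name__ == "__main__":
    layer = Top1MoEFeedForward()
    y = layer(torch.randn(2, 10, 64))
    print(y.shape)  # torch.Size([2, 10, 64])
```

The interpretability question is whether such a gate produces experts that specialize in human-legible ways, making the sparse model easier to analyze than a dense feed-forward block of comparable capacity.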
Related questions
Is gpt-3.5-turbo a Mixture of Experts (MoE)?
84% chance
Will the most capable, public multimodal model at the end of 2027 in my judgement use a transformer-like architecture?
63% chance
Will mechanistic/transformer interpretability [eg Neel Nanda] end up affecting p(doom) more than 5%?
10% chance
By EOY 2025, will the model with the lowest perplexity on Common Crawl not be based on transformers?
10% chance