Will transformer architectures lose their dominant position in deep learning before 2028?
16% chance

Resolution criteria

This market resolves YES if, by December 31, 2027, transformer architectures no longer hold the dominant position in deep learning across the majority of major application domains (NLP, computer vision, multimodal learning, and time series forecasting). Dominance is defined as being the most widely adopted and state-of-the-art architecture for most of these domains, as evidenced by:

  • Leading performance on major benchmarks (e.g., MMLU, ImageNet, COCO)

  • Prevalence in deployed production systems and published research

  • Adoption by major AI labs and companies

The market resolves NO if transformers remain the dominant architecture in most major domains. Hybrid architectures that incorporate transformer components (e.g., transformer-SSM hybrids) do not count as a loss of dominance for transformers.

Background

Transformers have emerged as the leading architecture in deep learning and have been the de facto standard model in artificial intelligence since their introduction in 2017. When paired with self-supervised training on extensive datasets, they have achieved top performance across numerous benchmarks in NLP and computer vision.

Alternatives to transformers include sub-quadratic attention variants, Recurrent Neural Networks (RNNs), State Space Models (SSMs), and hybrids thereof. The Mamba architecture, a refinement of state-space sequence models, can accommodate very long input sequences while processing them more efficiently than full attention. Several authors report that mixing recurrent blocks with multi-head attention blocks improves quality; examples include StripedHyena-Hessian-7B (a hybrid of attention and state-space models), Griffin (alternating recurrent blocks with local multi-query attention), and Jamba (combining Mamba layers, Transformer layers, and mixture-of-experts routing).
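
As a rough illustration of how such hybrids compose layers, the sketch below interleaves standard self-attention blocks with a gated recurrent block standing in for an SSM layer. The module names, the GRU stand-in, and the 1:4 attention-to-recurrent ratio are illustrative assumptions, not the actual Jamba, Griffin, or StripedHyena implementations.

```python
# Minimal sketch of a hybrid attention/recurrent layer stack.
# Everything here is an illustrative assumption, not real model code.
import torch
import torch.nn as nn


class AttentionBlock(nn.Module):
    """Pre-norm multi-head self-attention block with a residual connection."""
    def __init__(self, dim: int, n_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, x):
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out


class GatedRecurrentBlock(nn.Module):
    """Stand-in for an SSM / linear-recurrent layer (e.g. a Mamba or Griffin block),
    approximated here with a GRU plus a sigmoid gate purely for illustration."""
    def __init__(self, dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.gate = nn.Linear(dim, dim)

    def forward(self, x):
        h = self.norm(x)
        rec, _ = self.rnn(h)
        return x + rec * torch.sigmoid(self.gate(h))


class HybridStack(nn.Module):
    """Interleaves one attention block for every `ratio` recurrent blocks,
    loosely mirroring the layer mixing described for Jamba-style hybrids."""
    def __init__(self, dim: int, depth: int = 8, ratio: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(
            [AttentionBlock(dim) if i % ratio == 0 else GatedRecurrentBlock(dim)
             for i in range(depth)]
        )

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


if __name__ == "__main__":
    model = HybridStack(dim=64)
    tokens = torch.randn(2, 128, 64)  # (batch, sequence, embedding)
    print(model(tokens).shape)  # torch.Size([2, 128, 64])
```

The appeal of this kind of layer mixing is that the recurrent blocks keep per-token cost roughly constant in sequence length, while the occasional attention block preserves global token mixing.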

Considerations

While transformers remain dominant, alternatives are finding footholds in specific use cases and operational niches, though at the frontier full attention is likely to remain central for the foreseeable future. The shift is not toward outright replacement but toward building flexible systems from a growing set of specialized primitives, with model routing and Mixture of Architectures (MoA) paradigms becoming more relevant. It remains unclear whether specialized architectures will substantially outperform the general transformer on particular sets of tasks; so far, the general-purpose transformer has seemed to defy the "no free lunch" theorem by staying competitive across most of them.
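
The sketch below is one hypothetical reading of a Mixture of Architectures setup: a router that dispatches long sequences to a cheaper recurrent expert and shorter ones to full attention. The length-based routing rule, the threshold, and the GRU stand-in for an SSM are assumptions for illustration, not a description of any deployed system.

```python
# Illustrative sketch only: one way architecture-level routing might work.
import torch
import torch.nn as nn


class MixtureOfArchitectures(nn.Module):
    def __init__(self, dim: int, long_context_threshold: int = 4096):
        super().__init__()
        self.threshold = long_context_threshold
        self.attention_expert = nn.MultiheadAttention(dim, 8, batch_first=True)
        self.recurrent_expert = nn.GRU(dim, dim, batch_first=True)  # stand-in for an SSM

    def forward(self, x):
        # Route long sequences to the cheaper recurrent expert, short ones to full attention.
        if x.size(1) > self.threshold:
            out, _ = self.recurrent_expert(x)
        else:
            out, _ = self.attention_expert(x, x, x, need_weights=False)
        return x + out


if __name__ == "__main__":
    router = MixtureOfArchitectures(dim=64, long_context_threshold=256)
    short_seq = torch.randn(1, 128, 64)
    long_seq = torch.randn(1, 512, 64)
    print(router(short_seq).shape, router(long_seq).shape)
```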
