When will a non-Transformer model become the top open source LLM?
In 2024: 16%
In 2025 or earlier: 53%
In 2026 or earlier: 62%
In 2027 or earlier: 63%
In 2028 or earlier: 65%
In 2029 or earlier: 68%
In 2030 or earlier: 70%

In 2024, the field of AI language models is dominated by Transformers. Many research papers propose alterations and new architectures, but none of them has successfully competed with the Llamas and their open-source friends.

The market will resolve positively as soon as the first place on the Hugging Face Open LLM Leaderboard (https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) is taken by a non-Transformer model.

If Hugging Face disappears or falls into obscurity by the time this market is resolved, another similar ranking will be used.
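
For reference, here is a minimal sketch of how one could check which architecture family a leaderboard entry declares on Hugging Face. This is not the official resolution procedure; the set of Transformer model types below is illustrative and incomplete, and borderline cases will still be judged by the criteria in this description.

```python
# Minimal sketch: read the architecture family a Hugging Face model declares
# in its config. TRANSFORMER_TYPES is illustrative, not exhaustive, and a
# listed model_type is not by itself conclusive for resolution.
from transformers import AutoConfig

TRANSFORMER_TYPES = {"llama", "mistral", "mixtral", "falcon", "gpt2", "gpt_neox", "qwen2"}

def declares_transformer(model_id: str) -> bool:
    config = AutoConfig.from_pretrained(model_id)
    return config.model_type in TRANSFORMER_TYPES

print(declares_transformer("gpt2"))  # True: a classic Transformer
# A state-space model such as "state-spaces/mamba-2.8b-hf" reports
# model_type "mamba" and would fall outside the set above.
```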

The following modifications to a Transformer model are not enough to consider it a new model for the purposes of this question:

  • Changes to the activation function

  • Extra dense layers

  • Changes to normalization/dropout/...

  • Changes to the number of heads/keys/queries etc.

  • Minor changes to how attention components are calculated (e.g. adding a bias term or some non-linearity)

  • Using Transformers in an ensemble

  • Other changes such that Wikipedia still categorizes the new model as a Transformer

An attention-based model in which attention is applied not to pairs of positions but to some other domain, especially if this leads to a significant improvement in efficiency, qualifies as a significant change for the purposes of this question.

I do not bet on my own questions.


In the 90s an incredible amount of effort went into doing 3D occlusion using clever approaches that avoided the gargantuan memory cost of a floating point depth buffer. Now we just pay for a depth buffer.

We're just gonna pay N^2 compute (and linear memory)

@HastingsGreer N^2 might be a tough call if you want it to write a novel, or simulate a person living for decades.

Also, sub-N^2 scaling is not a necessary condition for this market to resolve positively.
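
Back of the envelope (illustrative numbers only, not tied to any specific model):

```python
# Vanilla attention forms n^2 pairwise scores per head per layer, while the
# KV cache grows only linearly in n. Context lengths below are illustrative.
for n in (4_096, 128_000, 1_000_000):
    print(f"{n:>9} tokens -> {n * n:.3e} attention scores per head per layer")
```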

bought Ṁ10 In 2025 or earlier NO

@OlegEterevsky

N^2 might be a tough call if you want it to write a novel, or simulate a person living for decades.

Gemini 1.5's 1M context window convinced me transformers can scale pretty well for that.

@singer I don't know the technical details, but I'm assuming it's not technically a "traditional" Transformer.