[Carlini questions] Will we still use (slight modifications of) the transformer-based LLMs we currently use?
On Jan 1st 2027: 80%
On Jan 1st 2030: 52%

Full question: "chance that the best AI system will be recognizable as (a slight modification of) the transformer-based Large Language Models (LLMs) we use in 2024"

Resolution Criteria:

I will evaluate this the same way I would judge today's models: the best current models are basically recognizable as scaled-up versions of the transformers we used in 2017. Obviously there have been improvements to the architecture, the training methodology, the data, the fine-tuning process, etc. But the core idea of a modern transformer like Llama-3 (an open-source model that matches GPT-4's performance) would be recognizable to someone in 2017. In this same sense, I'm asking whether the core idea behind the best model in 2027/2030 will be recognizable as an improved Llama. (Or will we be using some wildly new architecture?) I would consider diffusion, state-space, or other models that are not transformers to be "new architectures".

Motivation and Context:

For the last seven years, every state-of-the-art language model has been a "transformer" language model as introduced in the paper "Attention is All You Need" by Vaswani et al. Other architectures have been tried, but none have caught on in quite the same way. Will this continue to be the case? Or will we discover some new architecture that is much better?
