My probability in 2026 that training transformer LMs will eventually lead to inner misalignment issues
59% chance
Resolves to my probability that the language modelling objective has substantial inner misalignment issues in transformers when scaled up with up to 50 orders of magnitude (OOM) more compute than Chinchilla.

I haven't thought much about what happens with that much more compute. I'm currently not very worried about inner misalignment risks from GPT models in the next 8 years, where 99% of the training compute is spent on the language modelling objective.
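For a sense of scale, here is a back-of-the-envelope sketch of what "50 OOM more compute than Chinchilla" means, assuming the standard C ≈ 6ND FLOP approximation and Chinchilla's published training run (70B parameters on 1.4T tokens, Hoffmann et al., 2022); the specific numbers are my assumptions, not part of the market.

```python
# Back-of-the-envelope: the compute ceiling implied by "50 OOM more than Chinchilla".
# Assumes the standard C ~ 6*N*D FLOP estimate and Chinchilla's published
# 70B parameters / 1.4T training tokens (Hoffmann et al., 2022).

chinchilla_params = 70e9        # N: model parameters
chinchilla_tokens = 1.4e12      # D: training tokens
chinchilla_flops = 6 * chinchilla_params * chinchilla_tokens  # ~5.9e23 FLOPs

extra_ooms = 50                 # the market's upper bound
ceiling_flops = chinchilla_flops * 10**extra_ooms             # ~5.9e73 FLOPs

print(f"Chinchilla training compute: ~{chinchilla_flops:.1e} FLOPs")
print(f"+50 OOM ceiling:             ~{ceiling_flops:.1e} FLOPs")
```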
Related questions
Will superposition in transformers be mostly solved by 2026? (62% chance)
Will any foundation models/LLMs be able to reliably come up with novel unparalleled misalignments before EOY 2024? (45% chance)
Will Transformer-based architectures still be SOTA for language modelling by 2026? (74% chance)
Will deceptive misalignment occur in any AI system before 2030? (67% chance)
Will Inner or Outer AI alignment be considered "mostly solved" first?
Will reinforcement learning overtake LMs on math before 2028? (57% chance)
Will second-order optimizers displace first-order optimizers for training LLMs by 2030? (38% chance)
Conditional on there being no AI takeoff before 2030, will the majority of AI researchers believe that AI alignment is solved? (35% chance)
When will a non-Transformer model become the top open source LLM?
End of pre-training era for language models: Will an LM fine-tune for more FLOPs than it is pre-trained for, before 2026? (22% chance)