Will a big transformer LM compose these facts without chain of thought by 2026? (harder question version)


43% chance


This is a version of this market with a harder composition question, in case the crux is about the difficulty of the question: https://manifold.markets/LeoGao/will-a-big-transformer-lm-compose-t?r=TGVvR2Fv

The question for this market is "What is the name of the element with an atomic number equal to the sum of the age at which Euler died and the number of faces on a cube?" (Please don't post the answer in the comments to avoid it making its way into a training set.)

All the other rules are the same as the referenced market.

Close date updated to 2026-01-01 3:59 pm

Jan 19, 9:14am: ~~Will a big transformer LM compose these facts without chain of thought by 2026? (hard mode)~~ → Will a big transformer LM compose these facts without chain of thought by 2026? (harder question version)

LLMs



bought Ṁ1,000 YES

Mark as resolved

@Markdawg You’re probably joking but in case it isn’t clear to others: these new models are doing chain-of-thought under the hood.

https://arxiv.org/pdf/2402.16837

predictedNO

GPT-4 is able to do multi-step math without chain of thought. Fact composition like this seems to show up in text a lot less often than multi-step math does.

One algorithm it could learn is "get the answers to the facts internally, then combine them using existing multi-step math circuitry", which seems kinda reasonable, though internal fact numbers are probably not stored the way token numbers currently are.
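To make the shape of that algorithm concrete, here is a toy sketch in Python. It deliberately uses a different, easier question than the market's, so it does not give the answer away. The lookup tables and the `compose` function are purely illustrative stand-ins for knowledge stored in weights; nothing here is a claim about how a real model implements this.

```python
# Hypothetical fact store: stands in for numeric facts a model holds internally.
NUMERIC_FACTS = {
    "sides of a hexagon": 6,
    "days in a week": 7,
}

# Hypothetical second lookup: atomic number -> element name (tiny excerpt).
ELEMENT_BY_ATOMIC_NUMBER = {12: "magnesium", 13: "aluminium", 14: "silicon"}


def compose(fact_a: str, fact_b: str) -> str:
    """Recall two numeric facts, add them, then look up the element."""
    a = NUMERIC_FACTS[fact_a]  # step 1: internal recall of fact A
    b = NUMERIC_FACTS[fact_b]  # step 2: internal recall of fact B
    total = a + b              # step 3: the "multi-step math" part
    return ELEMENT_BY_ATOMIC_NUMBER[total]  # step 4: final recall


answer = compose("sides of a hexagon", "days in a week")
print(answer)
```

The point of the sketch is that none of the intermediate values (`a`, `b`, `total`) are ever written out as text; without chain of thought, the model would have to carry all of them in its activations within a single forward pass.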

I wonder if you can Fermi-estimate when these capabilities will arrive, with one input being the number of occurrences of a pattern in web text and another being how much the capability would improve the loss on average in those occurrences. You'd have to check whether your estimation method could predict past advances.
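A minimal sketch of what that Fermi estimate could look like, in Python. Every number below is made up for illustration (corpus size, occurrence counts, and per-occurrence loss drop are all assumptions, not measurements), and `avg_loss_gain` is a hypothetical helper, not an established metric.

```python
import math


def avg_loss_gain(occurrences: float, total_tokens: float,
                  loss_drop_per_occurrence: float) -> float:
    """Average per-token loss reduction from learning a capability.

    occurrences: how often the pattern appears in the corpus (assumed)
    total_tokens: corpus size in tokens (assumed)
    loss_drop_per_occurrence: loss saved each time the pattern appears (assumed)
    """
    return occurrences / total_tokens * loss_drop_per_occurrence


# Made-up inputs, only to show how the two capabilities would be compared.
TOTAL_TOKENS = 1e12
math_gain = avg_loss_gain(1e8, TOTAL_TOKENS, 2.0)         # multi-step math
composition_gain = avg_loss_gain(1e5, TOTAL_TOKENS, 2.0)  # fact composition

ratio = math_gain / composition_gain
print(f"math: {math_gain:.1e}, composition: {composition_gain:.1e}, "
      f"ratio: {ratio:.0f}x")
```

Under these invented inputs the pressure to learn fact composition is about a thousand times weaker than for multi-step math. Turning that into an arrival date would still need a calibration step against capabilities whose arrival is already known.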

ChatGPT already comes pretty close, so I think this market is underpriced.

@datageneratingprocess Just saw the rules of the referenced market; my ChatGPT result doesn't count.

@datageneratingprocess I tested this with ChatGPT 4 just now. It got it wrong and then claimed it had got it right until I asked for the atomic number of the correct element (and after that, until I pointed out the contradiction).

I suppose with more trials it might guess correctly though.