Will the leading LLM at the beginning of 2026 still be subject to the reversal curse?

This paper shows that current LLMs fine-tuned on facts of the form f(x) = y often fail to generalize to the reverse, f⁻¹(y) = x. The paper's real-world example: GPT-4 can answer "Who is Tom Cruise's mother?" (Mary Lee Pfeiffer) but often cannot answer "Who is Mary Lee Pfeiffer's son?". Gary Marcus seems to think this is a fundamental problem with the current approach to AI.

I tested this myself and can confirm that ChatGPT has this problem.

At the beginning of 2026 I'll try something similar with the leading language model of the time (not the fine-tuning, just testing facts from its main training run via its public interface). If there's at least one example where it consistently gets that f(x) = y and consistently fails to get that f⁻¹(y) = x, this resolves YES. If I can't find such an example, it resolves NO.
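For concreteness, here's a minimal sketch of the kind of consistency check I have in mind. It assumes the OpenAI Python SDK as the public interface and uses "gpt-4" as a stand-in for whatever the leading model is at resolution time; the question pair is the reversal-curse paper's Tom Cruise / Mary Lee Pfeiffer example.

```python
# A minimal sketch of the planned check, not the exact resolution procedure.
# Assumptions (not from the market description): the OpenAI Python SDK as the
# public interface, and "gpt-4" as a placeholder for the leading model.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
MODEL = "gpt-4"    # placeholder for the leading model at resolution time

def ask(question: str, n: int = 10) -> list[str]:
    """Ask the same question n times and collect the model's answers."""
    replies = []
    for _ in range(n):
        resp = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": question}],
        )
        replies.append(resp.choices[0].message.content or "")
    return replies

def mentions(reply: str, target: str) -> bool:
    return target.lower() in reply.lower()

forward = ask("Who is Tom Cruise's mother?")
inverse = ask("Who is Mary Lee Pfeiffer's son?")

# YES condition on this one example: the forward direction is consistently
# right while the inverse direction is consistently wrong.
curse_here = (all(mentions(r, "Mary Lee Pfeiffer") for r in forward)
              and not any(mentions(r, "Tom Cruise") for r in inverse))
print("Reversal curse observed on this pair:", curse_here)
```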

If the best general-purpose AI is no longer an LLM, this resolves N/A.

Does this only apply to leading large language models? I.e., if other architectures for SOTA general-purpose AI appear that are no longer considered language models (perhaps because that's no longer their primary training task, or because they're no longer Transformer-based), would you check those models rather than the best LLMs?

@Vergissfunktor I'll resolve N/A in that case.