
Current large language models (LLMs) are capable at many rich language tasks, but remain prone to a failure mode known as hallucination: confidently outputting false "facts" as though they were true.
One view might be that this is an inherent feature of LLMs built on the current next-token-prediction architecture, since the model has no (direct) notion of its own overconfidence. Such a view would not expect scaling current models to significantly reduce hallucination.
Another perspective might expect larger models to develop emergent capabilities which reduce or eliminate hallucination. If a knowledgeable human would respond to a hallucination-inducing question with something like "the question does not make sense", we might expect a capable language model to learn similar patterns.
This market resolves to TRUE if I am shown:
Compelling evidence of robustly low levels of hallucination...
From a model which does not introduce any as-yet unused techniques for eliminating hallucination...
Before the market closes on January 1, 2027.
I'll define "robustly low levels of hallucination" as "hallucinates false facts in fewer than 0.1% of responses for a set of difficult questions", or comparable levels of evidence. I'll define "as-yet unused techniques" as techniques which are not currently used by any major LLM. Solutions such as "do far more RLHF specifically on hallucination-inducing inputs" would not count as a new technique.
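For concreteness, here is a minimal sketch of how the headline criterion could be checked. The question set, the model's answer function, and the hallucination judge are all hypothetical placeholders for illustration, not a reference to any actual benchmark or API.

```python
# Hypothetical sketch of the "<0.1% of responses" criterion.
# `questions`, `answer_fn` (the model under test), and `judge_fn`
# (a human or automated fact-checker) are placeholders.

def hallucination_rate(questions, answer_fn, judge_fn):
    """Fraction of responses containing at least one false 'fact'."""
    hallucinated = sum(1 for q in questions if judge_fn(q, answer_fn(q)))
    return hallucinated / len(questions)

THRESHOLD = 0.001  # "fewer than 0.1% of responses"

def meets_criterion(questions, answer_fn, judge_fn):
    return hallucination_rate(questions, answer_fn, judge_fn) < THRESHOLD
```

Note that credibly demonstrating a rate below 0.1% would require on the order of a few thousand difficult questions with essentially no observed hallucinations, which is why "comparable levels of evidence" is accepted as an alternative.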
The market resolves to FALSE if no such evidence is produced by the market close date, OR if a compelling proof is given that hallucination cannot be eliminated by scaling current methods alone. Such a proof must have reasonably wide acceptance in the machine learning community.
Apr 2, 9:13pm: Will scale alone be enough to eliminate LLM hallucination? → Will scaling current methods be enough to eliminate LLM hallucination?