Will LLM hallucinations be a fixed problem by the end of 2025?
22% chance

https://fortune.com/2023/08/01/can-ai-chatgpt-hallucinations-be-fixed-experts-doubt-altman-openai/

“This isn’t fixable,” said Emily Bender, a linguistics professor and director of the University of Washington’s Computational Linguistics Laboratory. “It’s inherent in the mismatch between the technology and the proposed use cases.”

How true will this end up being? At the end of 2025 I will evaluate whether the hallucination problem for LLMs has been fixed or still exists. If hallucinations have been solved, this market resolves YES. If the outstanding hallucination problem still exists, this market will resolve NO.

Related market:


DeepMind claims to be making progress on separating epistemic and aleatoric uncertainty, which would go a long way toward solving hallucinations.

https://arxiv.org/pdf/2406.02543
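As a toy illustration of the general idea (not the paper's actual method), you can probe how much a model's sampled answers disagree with each other: for a question with a single correct answer, high disagreement is a rough proxy for epistemic uncertainty. The `sample_answer` callable below is a placeholder for any temperature > 0 chat call.

```python
# Toy sketch, not the DeepMind paper's method: resample a question and
# measure how often the model's answers disagree with each other.
from collections import Counter
from typing import Callable, List

def answer_disagreement(sample_answer: Callable[[str], str],
                        question: str, n: int = 10) -> float:
    """Return 1 minus the frequency of the most common answer over n samples."""
    answers: List[str] = [sample_answer(question).strip().lower() for _ in range(n)]
    most_common_count = Counter(answers).most_common(1)[0][1]
    return 1.0 - most_common_count / n
```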

The ContextCite method, which looks at the counterfactual logits when you ablate a source, seems simple and potentially effective.

https://gradientscience.org/contextcite/
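The rough shape of the source-ablation idea, as I understand it (a leave-one-out simplification, not the actual ContextCite implementation, which fits a surrogate model over random ablations): score how much dropping each context source changes the log-probability the model assigns to the response it already generated. `logprob_of_response` is a stand-in for whatever scoring call your model exposes.

```python
# Leave-one-out sketch of source attribution (simplified, not ContextCite's code).
from typing import Callable, Dict, List

def source_attributions(logprob_of_response: Callable[[List[str], str, str], float],
                        sources: List[str], query: str, response: str) -> Dict[int, float]:
    full = logprob_of_response(sources, query, response)
    scores: Dict[int, float] = {}
    for i in range(len(sources)):
        ablated = sources[:i] + sources[i + 1:]   # drop source i from the context
        scores[i] = full - logprob_of_response(ablated, query, response)
    return scores  # large positive score => the response leaned on source i
```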

In general, "hallucinations" are caused by the fact that the text prior generalizes by being a stochastic model: in order to generalize, it has to model all of the possibilities implied by the data, not just the particular timeline in which it finds itself. However, it is possible to constrain sampling from the text prior with e.g. an embedding, as I do in my AdaVAE experiments:

https://www.greaterwrong.com/posts/4Hnso8NMAeeYs8Cta/revealing-intentionality-in-language-models-through-adavae

I would expect that by the end of 2025 someone will have noticed that you can do in-context decoding of an embedding and use this to fit retrieved memories to the local context in a way that minimizes hallucinations. Right now we don't actually pair our decoder-only transformer language models with an encoder, but in principle you could, and if there's a lot of pressure to solve hallucinations I don't see why you wouldn't eventually. Since I know the problem is solvable in principle, and the incentives to solve it are overwhelming, it would be fairly weird to me if nobody did.
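A minimal sketch of what pairing an encoder with a decoder-only model could look like (my own illustration, not the AdaVAE code): project an external embedding into a few soft-prompt vectors and prepend them to the decoder's input embeddings, so that sampling is constrained by the embedding.

```python
# Illustrative only: condition a decoder-only LM on an external embedding
# by projecting it into a short "soft prompt" prepended to the token embeddings.
import torch
import torch.nn as nn

class EmbeddingConditioner(nn.Module):
    def __init__(self, embed_dim: int, model_dim: int, n_prefix_tokens: int = 8):
        super().__init__()
        self.n_prefix_tokens = n_prefix_tokens
        self.proj = nn.Linear(embed_dim, model_dim * n_prefix_tokens)

    def forward(self, embedding: torch.Tensor, token_embeds: torch.Tensor) -> torch.Tensor:
        # embedding: (batch, embed_dim); token_embeds: (batch, seq, model_dim)
        prefix = self.proj(embedding).view(embedding.size(0), self.n_prefix_tokens, -1)
        return torch.cat([prefix, token_embeds], dim=1)  # feed the result to the decoder
```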

@SneakySly Only skimmed the paper, but I'm not very impressed by it as progress towards "solving LLM hallucination". It's not clear to me what about their method is new. They also focus on QA tasks with clearly defined right answers, which their method is relatively easy to apply to. Real-world use is often much more open-ended.

The nature of LLMs as next-token predictors might make this intractable IMO.

Could you clarify what you mean by "LLM" and the "hallucination problem"?

As some other people have asked, does an AI with an LLM bootstrapped to some other technique count? How much of the AI needs to be the LLM to meet the resolution criterion?

As for the hallucination problem, do you mean the specific problem of LLM hallucinations, compared to human hallucination for example, or do you mean any degree of hallucination, even if the AI is superhuman at not making stuff up? Does it count if one can make them hallucinate through prompt engineering, even if they are otherwise hallucinating less than humans?

@PierreLamotte For example, if we get a new ChatGPT that is LLM + other stuff and that solves it, that will count. If a new ChatGPT had no LLM aspect at all (a totally new architecture), then that would not count.

I clarified hallucinations in the linked market, which has the same criteria but a different year:

"
- What if very few everyday users can elicit hallucinations, but adversarial prompts created by experts still can?
Depends on the prompts. I think that if I personally can copy a prompt that follows the definitions already established and elicit a hallucination then the market resolves NO.
"

It's a tough bar!

@SneakySly I gave the following in the other thread:

True, if OpenAI claimed that GPT-X hallucinated 99% less often, that should resolve this market as YES. For the spirit of this question, that is essentially solving the problem. How do we feel about operationalizing this as: if someone posts a reputable article indicating a 95%+ reduction in hallucinations, we can say that hallucinations were not an intractable hurdle like these experts claimed, and the market can resolve YES. (Benchmarks would work as well if they get created.)

@SneakySly GPT-X compared to what? Compared to release GPT-4? Compared to GPT-3.5? Compared to GPT-(X-1)?

@SneakySly (only now seeing this) fwiw I had not interpreted this market as trying to operationalize hallucinations being an intractable hurdle (which I don't believe).

You can almost always find hallucinations if you're looking for them, even in bleeding-edge LLMs. One pretty reliable method is to take a well-known trick question and modify it to remove the trick. Even the top LLMs will often still respond as if you'd asked the original version of the trick question, rather than what you actually asked. Here's an example from GPT-4o:

Memory is flawed. Humans hallucinate all the time, just like GPT-4.

predicts YES

I'm surprised that this market isn't higher; it seems that there has been notable progress on the accuracy of LLM-generated content over the last ~1 year, and this seems likely to continue.

It's plausible that hallucinations are an inherent artifact of such stochastic generative systems, but "solving" hallucinations appears to be reducible to a problem of detecting when they occur, and supplementing with external sources or otherwise signaling the uncertainty. Perhaps I'm unclear on the resolution criteria, but as I understand it, the 98% accuracy with RAG described by OpenAI would have been sufficient to resolve YES (if it was actually achieved).

There also seems to be some tradeoff between an LLM being able to provide a correct answer and not providing incorrect answers, as you can always raise the bar of certainty required for a fact to be stated. I think a system that tells the right answer 80% of the time, is wrong 0.1% of the time, and otherwise explains that it does not know is more useful than one that answers correctly 99% of the time and is wrong 1% of the time. If such a ratio as 80%/19.9%/0.1% is plausible by fine-tuning current systems to hedge more frequently, would this pass for hallucinations being solved? If the rate of falsehoods is the primary factor, then I feel it should, as it would be greater than human level.
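To make the tradeoff concrete, here is a sketch of the hedging knob, assuming you have some per-answer confidence score (however it is obtained); raising the threshold moves answers from the "wrong" bucket into the "declined to answer" bucket.

```python
# Sketch of the hedging tradeoff: answer only above a confidence threshold.
from typing import Callable, Optional, Tuple

def answer_or_abstain(generate: Callable[[str], Tuple[str, float]],
                      question: str, threshold: float = 0.9) -> Optional[str]:
    answer, confidence = generate(question)   # confidence in [0, 1], model-dependent
    if confidence >= threshold:
        return answer
    return None  # the caller treats None as "I don't know"
```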

It's also worth considering that hallucinations are one of the primary (if not THE primary) roadblocks preventing more widespread use of current LLMs. It seems likely that huge resources will be put into improving their factual accuracy over the next two years.

Is there something in particular that I'm missing?

bought Ṁ100 NO from 33% to 31%

@CollinGray You are describing a machine that can tell whether an arbitrary statement is true or not. We can't do that as humans, so why do you think it's going to be easy to do with an LLM?

Otherwise I agree with you: I'd rather have something that's only wrong 0.1% of the time, even if I had to sacrifice some answers that were correct but not confident. I disagree with your numbers, though; 80% of the time is not good enough. That would be too low a threshold and would result in the AI only being able to answer trivially true things.

predicts YES

@Odoacre You're right that 80% of questions being answered is too low to be very useful, but my larger point was that "solving hallucinations" is a very fuzzy target: you can always reduce fake answers at the cost of correct answers by rewarding hedging, so I'm unclear about what a system that fulfills this market would look like.

As for your first point, I agree that determining whether an arbitrary statement is true or not is very difficult, but LLMs have more information to work with, i.e. a probability distribution over all possible answers. The assumption here is that, unlike hallucinated facts, memorized facts are far more self-consistent. For example, "Washington" will dominate the next-token probability distribution(s) for "The 1st president was", but "The 200th president was" will result in a distribution with multiple top possibilities. For a better explanation than I can give, check out this paper.
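A toy version of that intuition, with `next_token_probs` standing in for whatever top-k probability API you have: memorized facts should concentrate next-token probability mass (low entropy), while confabulated ones should spread it out (high entropy).

```python
# Illustration of the self-consistency intuition via next-token entropy.
import math
from typing import Callable, Dict

def next_token_entropy(next_token_probs: Callable[[str], Dict[str, float]],
                       prompt: str) -> float:
    probs = next_token_probs(prompt)          # maps candidate token -> probability
    return -sum(p * math.log(p) for p in probs.values() if p > 0)

# Expectation: entropy("The 1st president was") is low ("Washington" dominates),
# while entropy("The 200th president was") is high (no single name dominates).
```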

This depends largely on how you define an "LLM".

Could an LLM plus a bunch of non-LLM infrastructure be forced to forgo all hallucinations?
Possibly.

Could a different architecture from current-day LLMs, one that actually performs cognition-like operations on its internal knowledge but superficially resembles LLMs externally, forgo all hallucinations?
Possibly.

Could a pure LLM forgo hallucination?
Not with any training dataset that presently exists.

There are many degrees of "fixing" hallucinations. If we're talking about always stating things with 100% accuracy or replying with "unknown", then LLMs will be restricted to producing mathematical formulas alone, as anything else could potentially be attacked as a "hallucination" for not being precise enough.

If we're talking about being less prone to hallucinations than a human with access to Google, then it's a much more realistic proposition, and I would be YES on that.

predicts YES

Relevant - because if they did, then it's more or less solved.

predicts NO

@firstuserhere Eh, it's highly doubtful that getting a high benchmark in that kind of controlled environment is equivalent to solving the problem, imo. Like, I would still be very surprised if it doesn't still hallucinate about things right at the edge of its knowledge context or when it comes to more niche domain-specific questions.

predicts YES

@Sphinxfire That's why I only bought it up by <10%.

@firstuserhere Tbh they aren't related. Also already mentioned why that market should just be N/Aed.

Gary Marcus offered to bet $100,000 on this! Seems like an important question, and personally I'm surprised to see this as high as 23% when Marcus is offering to bet so much against it! I'm going to add in a 2000 mana subsidy here as part of the Public Interest Subsidy Program.

bought Ṁ200 NO from 30% to 28%

Oh, and I see someone made an even more specific market with a 3-month timeline in response to Hoffman! Arb away!

I have created a new related market set to the end of 2028.

Might depend on