Will we see improvements in the TruthfulQA LLM benchmark in 2024?

Daron Acemoglu wrote an article with a series of vague AI predictions for 2024: https://web.archive.org/web/20240110122026/https://www.wired.com/story/get-ready-for-the-great-ai-disappointment/

One of them is: "More and more evidence will emerge that generative AI and large language models provide false information and are prone to hallucination—where an AI simply makes stuff up, and gets it wrong. Hopes of a quick fix to the hallucination problem via supervised learning, where these models are taught to stay away from questionable sources or statements, will prove optimistic at best. Because the architecture of these models is based on predicting the next word or words in a sequence, it will prove exceedingly difficult to have the predictions be anchored to known truths."

There is a benchmark called TruthfulQA that measures how truthfully a language model answers questions. The highest-scoring model in 2023 was GPT-4 at 0.59. Will we see any improvement on this benchmark in 2024?

This is the best link I could find listing different models run on the TruthfulQA benchmark, but I am open to other sources if they exist: https://paperswithcode.com/sota/question-answering-on-truthfulqa
