When will the next Large Language Model be released that has an LMSYS Arena Elo of at least 1334, i.e. 75 points better than the current leader?
As of April 22, 2024, there are 4 models with an Arena Elo between 1249 and 1259 according to the LMSYS leaderboard: 3 versions of GPT-4 and 1 version of Claude 3 Opus. The highest-rated GPT-3.5-Turbo version has an Elo of 1119, 46 points behind the lowest-rated GPT-4 version (0613 in both cases), while the 0314 versions of these models are separated by 82 points. A 75-point gap is therefore comparable to the jump from GPT-3.5 to GPT-4, so it would represent a breakthrough and a new generation of LLMs.
Elo will be evaluated 1 week after the dates shown. If 1334 is within the top contender's confidence interval, I'll wait 1 additional week and resolve based on the Elo then.
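
As a rough sketch of how the resolution rule above could be applied: the function name, its arguments, and the "wait" label are illustrative assumptions, and the Elo and confidence-interval bounds would be read off the leaderboard by hand.

```python
# Threshold from the question: current leader (1259) plus 75 points.
THRESHOLD = 1259 + 75  # 1334


def evaluate(elo, ci_low, ci_high, after_extra_week=False):
    """Return "yes", "no", or "wait" for the top contender.

    elo              -- the contender's Arena Elo point estimate
    ci_low, ci_high  -- bounds of its confidence interval (hypothetical inputs)
    after_extra_week -- True once the additional week has already passed
    """
    # First check: if the threshold sits inside the confidence interval,
    # the result is too close to call, so wait one additional week.
    if not after_extra_week and ci_low <= THRESHOLD <= ci_high:
        return "wait"
    # Otherwise (or after the extra week) resolve on the point estimate alone.
    return "yes" if elo >= THRESHOLD else "no"
```

For example, a contender at 1340 with an interval of 1330 to 1350 would give "wait" on the first check, and "yes" a week later if its Elo is still at or above 1334.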