Who will release the next Large Language Model with an LMSYS Arena Elo of at least 1334, i.e. 75 points above the current leader?
As of April 22, 2024, four models have an Arena Elo between 1249 and 1259 on the LMSYS leaderboard: three versions of GPT-4 and one version of Claude 3 Opus. The highest-rated GPT-3.5-Turbo version has an Elo of 1119, 46 points behind the lowest-rated GPT-4 version (0613 for both), while the 0314 versions of these models are 82 points apart. A 75-point gap would therefore represent a breakthrough and a new generation of LLMs.
Elo will be evaluated 1 week after the model enters the leaderboard. If 1334 is within the top contender's confidence interval, I'll wait 1 additional week and resolve based on the Elo then. If multiple models meet the criteria, the one with the earliest release date wins.
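
The resolution rule above can be sketched in Python. This is one possible reading of the rule, not an official procedure: the `Candidate` fields and the function names `qualifies` and `winner` are hypothetical, and the sketch assumes that "within the confidence interval" means the threshold lies between the interval's lower and upper bounds at the one-week mark.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional

# 1259 (top Elo as of April 22, 2024) + 75
THRESHOLD = 1334


@dataclass
class Candidate:
    """Hypothetical record of one model's leaderboard readings."""
    developer: str
    release_date: date
    elo_week1: float      # Elo 1 week after entering the leaderboard
    ci_low_week1: float   # lower bound of the confidence interval at week 1
    ci_high_week1: float  # upper bound of the confidence interval at week 1
    elo_week2: Optional[float] = None  # only needed if week 1 is ambiguous


def qualifies(c: Candidate) -> Optional[bool]:
    """Return True/False once decided, or None while the extra week is pending."""
    # Assumption: the threshold is "within" the interval if low <= 1334 <= high.
    if c.ci_low_week1 <= THRESHOLD <= c.ci_high_week1:
        if c.elo_week2 is None:
            return None  # wait 1 additional week
        return c.elo_week2 >= THRESHOLD
    return c.elo_week1 >= THRESHOLD


def winner(candidates: Iterable[Candidate]) -> Optional[str]:
    """Developer of the qualifying model with the earliest release date.

    Simplification: candidates that are still pending are ignored here.
    """
    ok = [c for c in candidates if qualifies(c) is True]
    if not ok:
        return None
    return min(ok, key=lambda c: c.release_date).developer
```

Calling `winner` on a list of `Candidate` records returns the developer whose qualifying model was released first, or `None` if no model has cleared the threshold yet.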