I was browsing Twitter, and I saw a post by Karpathy positively talking about ChatBot Arena, which is a platform for ranking LLMs based on human ratings. As expected, OpenAI is holding positions 1, 2, and 3. I wonder which company will be #1 at the end of 2024.
Screenshot of the rankings table taken on the 13th of December:
@traders Based on the comments below, I think it makes sense to resolve this question based on the ELO rating in case of a tie in "rank." When I created this question, a tie was not an option, so I doubt anyone even traded based on this assumption.
I created a similar question that only uses the rank. Feel free to trade on it.
@NoahRich i don’t think it is that interesting, if anything it shows google is out of the race for spot #1 this year. openai will pass them with the next minor update to 4o. they won’t even need to release a new model to pass google.
@Soli both could release another minor update in the time. there have even been reports shared here previously suggesting the potential lol
besides its interesting that google in its own has even reached this point. up until now they have been pretty far off. especially given their compute potential. interesting if they are finally starting to make use of their leg up in funding potential and compute
@NoahRich google reached this point already in july/august when they were ranked #1 for 1-2 weeks (see this other market that resolved yes) so imo no new information here that would be relevant for 2024 since there is a still a large enough gap between openai and google that can’t be closed this year. However for 2025 it is a different story and Google indeed might fully catch up instead of being 1-2 months behind.
@Soli Must've been right before I joined Manifold then! I joined in late August I think and at that time OpenAI was already leading. Thanks for sharing.
Elon you need to try harder. Your enemies have figured out distributed training. You need to go faster.
Google reportedly releasing in December
i wish i had an even bigger position on openai can someone please buy some no shares
@PeterBuyukliev i don't think they will add the preview model because you can easily infer its o1 by the time it takes to respond compared to the other models which will bias the whole evaluation and ruin the idea behind LMSYS
@PeterBuyukliev but maybe o1-mini will appear on the leaderboards since it is relatively fast and if it does then yes should count, same way the google gemini api searches the web before responding
@PeterBuyukliev ok no both models will be included on the leadeboard according to a tweet by LMSYS and they seem to have added a 30 sec latency for both models when one is o1 which i think is not enough to avoid bias :(
@NoahRich Bought in Google because I think its position at the time didn't reflect its real potential odds of winning.
@NoahRich IDK, Gemini feels very lame and always trailing behind the others. Something is broken is Google, I doubt they can deliver out of nowhere.
@ICRainbow I don't think it's "likely" per say, but I think it's more likely than the current odds would have us believe here on this market. If I check the Chatbot Arena responses, too....
not as big of a difference as I would've expected, as I too have generally found Gemini to be very lackluster in comparison to GPT
@NoahRich Yeah, I've seen those. I'm also a paid user of Gemini Advanced Pro Ultra Whatevs. Claude smokes it hands down for free.