I was browsing Twitter, and I saw a post by Karpathy speaking positively about Chatbot Arena, which is a platform for ranking LLMs based on human ratings. As expected, OpenAI is holding positions 1, 2, and 3. I wonder which company will be #1 at the end of 2024.
Screenshot of the rankings table taken on the 13th of December:
@traders Based on the comments below, I think it makes sense to resolve this question based on the Elo rating in case of a tie in "rank." When I created this question, a tie was not an option, so I doubt anyone even traded based on this assumption.
I created a similar question that only uses the rank. Feel free to trade on it.
Gemini flash 2.0 strawberry in the api
https://ai.google.dev/gemini-api/docs/thinking-mode
10k limit order @75% for anyone feeling brave
@WillSorenson it is slightly short of exp 1206. Are you assuming a thinking 1206 will be added?
@Usaar33 It appears more pleasant than o1 to me, which makes it unlikely o1 will top the charts. The following all have to go right for OAI to win:
1. They have to release a new model today
2. It has to actually be better in the dimensions that chatbot arena evaluates
3. Chatbot arena has to update it in time.
Possible! Not more than a 20% chance.
@NeuralBets i would give it an 80-90% chance that OpenAI releases a new model as part of their 12 days of Christmas, but i am not sure they will make it available to LMSYS before the end of the year. i am too deep at this point anyways so 🤷♂️
@Bayesian right now this position represents ~80% of my mana net worth, but i am doubling down and putting a large limit order at 40% on OpenAI
@Soli it should be said that a new model doesn’t mean that it will become #1. Reason 1: Google may have fine-tuned to perform way better on LMSYS. Reason 2: Google may have another fine-tuned model ready to answer any score release from OAI. Maybe Google's CEO and PMs have their compensation tied to end-of-year performance on LMSYS
@mathvc true, openai released the new preview model over the api yesterday (which is still not ranked in LMSYS) and I expect another major announcement sooon so we shalll seee how it goes
@mathvc I think it's more a question of how often the leaderboard is updated.
I agree with your stance, I just don't know if I want more exposure to this market with my novice level of understanding of the subject.
@NoahRich i don’t think it is that interesting. if anything, it shows Google is out of the race for spot #1 this year. OpenAI will pass them with the next minor update to 4o. they won’t even need to release a new model to pass Google.
@Soli both could release another minor update in that time. there have even been reports shared here previously suggesting the potential lol
besides, it's interesting that Google on its own has even reached this point. up until now they have been pretty far off, especially given their compute potential. interesting if they are finally starting to make use of their leg up in funding potential and compute
@NoahRich Google already reached this point in July/August, when they were ranked #1 for 1-2 weeks (see this other market that resolved yes), so imo there is no new information here that would be relevant for 2024, since there is still a large enough gap between OpenAI and Google that can’t be closed this year. However, for 2025 it is a different story, and Google might indeed fully catch up instead of being 1-2 months behind.
@Soli Must've been right before I joined Manifold then! I joined in late August I think and at that time OpenAI was already leading. Thanks for sharing.