Resolves to the same result as Kalshi's equivalent 2025 market, if it exists, which uses the LMSYS Chatbot Arena Leaderboard along with other rules. One exception: if two models tie, we will resolve them both to 50 percent.
If no such market exists, this market will use the ruleset of the 2024 market (linked below).
https://kalshi.com/markets/llm1/yearend-top-llm
Rules Summary (2024 version)
If OpenAI has the top-ranked LLM on Dec 31, 2024, then that market resolves to Yes. Outcome verified from LMSYS.
A tie would resolve to No.
Clarification 3/14/24 6:03 PM ET: The Contract's Underlying states that, "The Underlying for this Contract is the Arena Elo rankings of large language models on the LMSYS Chatbot Arena Leaderboard as checked at 10:00 AM ET daily after Issuance and before ." To be clear, this refers to the "rank" column of the leaderboard, not the Arena Elo Score proper. As of this update, the #1 spot is tied between two variations of GPT-4 and Claude 3 Opus. Moreover, the Payout Criterion states that, "If they [the target organization] have an LLM that is tied with another LLM, then the Payout Criterion is not fulfilled." To be clear, if the only tie is with the same organization, that organization's strike would resolve to Yes. So if on the target date the tie is between two variations of GPT-4, then the market would resolve Yes in favor of OpenAI. However, if it is tied between GPT-4 and Claude 3, then both OpenAI and Anthropic's strikes would resolve to No.
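To make the tie handling above concrete, here is a minimal Python sketch of both rules. It is only an illustration, not Kalshi's or this market's actual resolution procedure. The function names, the input format (a list of model/organization pairs sharing the #1 rank), and the placeholder model names are all assumptions made for the example. Two further assumptions: the 50 percent rule is read as applying to ties between different organizations, and a tie among more than two organizations is split equally, which the rules above do not actually specify.

```python
def resolve_2024_rules(top_ranked):
    """Sketch of the 2024 rule: Yes only if every rank-1 model
    belongs to a single organization; otherwise all tied orgs get No.

    top_ranked: list of (model_name, organization) pairs at rank 1.
    Organizations absent from the returned dict resolve to No.
    """
    orgs = {org for _, org in top_ranked}
    if len(orgs) == 1:
        # A tie only within one organization still resolves Yes.
        return {next(iter(orgs)): "Yes"}
    return {org: "No" for org in orgs}


def resolve_split_ties(top_ranked):
    """Sketch of this market's exception: tied organizations split
    the payout. The equal split beyond two orgs is an assumption.
    """
    orgs = sorted({org for _, org in top_ranked})
    share = 100 / len(orgs)
    return {org: share for org in orgs}


# Placeholder names for illustration only.
same_org_tie = [("GPT-4 variant A", "OpenAI"), ("GPT-4 variant B", "OpenAI")]
cross_org_tie = [("GPT-4", "OpenAI"), ("Claude 3 Opus", "Anthropic")]

print(resolve_2024_rules(same_org_tie))
print(resolve_2024_rules(cross_org_tie))
print(resolve_split_ties(cross_org_tie))
```

Under these assumptions, a GPT-4 vs. Claude 3 Opus tie resolves both strikes to No under the 2024 rules, while this market would resolve each to 50 percent.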
DeepMind seems underpriced, but I don't want to shift the market too much myself.
Reasoning: They're best at RL, which is hard to do, and super useful when it works.
This market seems to be biased towards "the best AI will be an LLM, so who has good LLMs now?" But, you know, take my mana if you disagree.