Resolution criteria
This market resolves to YES if, at any point before January 1, 2028, a large language model developed by a Chinese-headquartered organization achieves a strictly higher Elo rating than the highest-rated model developed by a US-headquartered organization on the official LMArena Overall Text Leaderboard. Otherwise, this market resolves to NO.
Primary Source of Truth: The Overall Text Leaderboard hosted by LMArena (formerly LMSYS Chatbot Arena) at lmarena.ai or its official Hugging Face Space.
Corporate Headquarters Classification:
Chinese-headquartered developers include, but are not limited to: DeepSeek, Alibaba (Qwen), ByteDance (Doubao), Tencent (Hunyuan), Baidu (Ernie), Zhipu AI (GLM), 01.AI (Yi), and Moonshot AI (Kimi).
US-headquartered developers include, but are not limited to: OpenAI, Anthropic, Google, Meta, xAI, and Microsoft.
Evidence: Any official, public leaderboard update or verified internet archive snapshot (e.g., Wayback Machine) of the LMArena leaderboard showing a Chinese model outranking all US models by Elo score prior to the cutoff date will suffice for a YES resolution.
Fallback Rule: If LMArena becomes defunct, stops updating, or is otherwise unavailable before 2028, the market will resolve using the next most prominent independent benchmarking index (such as Artificial Analysis or the Stanford AI Index Report). If no objective public index is available, the creator will resolve the market based on consensus expert evaluation of top frontier models.
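Illustrative sketch: the primary resolution rule above can be expressed as a small check over leaderboard snapshots. This is a rough sketch under stated assumptions, not an official resolution tool. The Entry type, the function names, and the snapshot format are hypothetical; the organization sets contain only the developers named above, so any unlisted developer would still need a manual headquarters ruling. Treating a snapshot with no listed Chinese or no listed US model as "not leading" is also an assumption, and the fallback rule is not modeled.

```python
from dataclasses import dataclass
from datetime import date

# Only the developers named in the classification lists above. The lists are
# "not limited to" these, so unlisted developers need a manual ruling (assumption).
CHINESE_ORGS = {"DeepSeek", "Alibaba", "ByteDance", "Tencent",
                "Baidu", "Zhipu AI", "01.AI", "Moonshot AI"}
US_ORGS = {"OpenAI", "Anthropic", "Google", "Meta", "xAI", "Microsoft"}

# Snapshots must be dated strictly before this date to count.
CUTOFF = date(2028, 1, 1)


@dataclass(frozen=True)
class Entry:
    """One leaderboard row (hypothetical shape, not LMArena's actual schema)."""
    model: str
    organization: str
    score: float


def chinese_model_leads(entries):
    """True if the best Chinese score is strictly above the best US score."""
    chinese = [e.score for e in entries if e.organization in CHINESE_ORGS]
    us = [e.score for e in entries if e.organization in US_ORGS]
    if not chinese or not us:
        # Assumption: with nothing to compare, the condition is not met.
        return False
    return max(chinese) > max(us)  # strict: a tie does not count


def resolves_yes(snapshots):
    """snapshots maps a snapshot date to the list of Entry rows seen that day."""
    return any(
        snapshot_date < CUTOFF and chinese_model_leads(entries)
        for snapshot_date, entries in snapshots.items()
    )
```

For example, with made-up placeholder scores, a snapshot dated 2027-06-01 containing Entry("cn-model", "DeepSeek", 1501.0) and Entry("us-model", "OpenAI", 1500.0) would satisfy the rule, while the same rows with equal scores, or the same rows dated 2028-01-01 or later, would not.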
Background
The competitive gap between US and Chinese frontier artificial intelligence models has narrowed rapidly. Following the late-2024 and 2025 releases of highly efficient models such as DeepSeek-V3 and DeepSeek-R1, as well as Alibaba's Qwen series, Chinese open-weights models have repeatedly challenged closed-source models from US labs such as OpenAI, Anthropic, and Google on standard coding, math, and reasoning benchmarks.
As of mid-2026, models from US frontier labs (specifically Anthropic's Claude series and OpenAI's GPT series) continue to hold the top spots on the LMArena leaderboard, with Chinese models such as Alibaba's Qwen and DeepSeek's releases close behind in the top tier. This market tracks whether a Chinese-developed model will overtake the top US model on the industry's premier crowd-sourced evaluation platform before the end of 2027.
This description was generated by AI.