EOY 2025: Will open LLMs perform at least as well as 50 Elo below closed-source LLMs on coding?
30% chance

On December 31, 2025, will the best closed-source LLM on the LMSys coding arena outperform the best open-weights LLM by less than 50 Elo points?

As of July 27, 2024, the gap is 58 Elo points.

If LMSys ceases to exist or to evaluate models, I will resolve to 50%.

If a model is open-weights but the LMSys eval runs it through an API (e.g. deepseekv2-API), it still qualifies as open-weights. The exception is if I get evidence that the API version differed enough to affect this question; in that case I would resolve to 50%.
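
The resolution rules above can be sketched as a small function. This is only an illustration, not the creator's actual procedure: the function name `resolve`, its parameters, and the example ratings are hypothetical, and it follows the description's strict "less than 50" wording, so a gap of exactly 50 is treated as NO here by assumption.

```python
from typing import Optional

THRESHOLD = 50  # Elo points, taken from the market question


def resolve(best_closed: Optional[float],
            best_open: Optional[float],
            api_version_differs: bool = False) -> float:
    """Return the resolution value: 1.0 = YES, 0.0 = NO, 0.5 = fallback.

    Hypothetical helper. Ratings are coding-only arena scores; None means
    LMSys no longer exists or no longer evaluates models.
    """
    # Fallback: LMSys ceased to exist or to evaluate models.
    if best_closed is None or best_open is None:
        return 0.5
    # Fallback: evidence that the API-served open model differed materially.
    if api_version_differs:
        return 0.5
    # Gap by which the best closed-source model leads the best open one.
    gap = best_closed - best_open
    # Assumption: strict inequality, per the description's "less than 50".
    return 1.0 if gap < THRESHOLD else 0.0


# Made-up ratings chosen only to reproduce the 58-point gap of July 27, 2024.
print(resolve(1058.0, 1000.0))
```

With the 58-point gap reported for July 27, 2024, this sketch would resolve NO; an open-weights model that led the closed-source one would give a negative gap and resolve YES.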

Chart from https://x.com/maximelabonne/status/1779801605702836454. It shows all-question Elo, whereas this market resolves on coding-only Elo; the trend is similar.

  • Update 2025-05-28 (PST) (AI summary of creator comment): The creator has updated the market title to clarify the resolution criteria. This was in response to a user's question about how the market resolves, particularly in scenarios involving the Elo difference between open-source and closed-source models. Please refer to the updated market title for the most precise definition of the resolution condition.
