Which LLM Maker will have the largest Emotional Intelligence Benchmark Elo Score on https://eqbench.com on Jan 1, 2027?
Moonshot AI (e.g. Kimi K series): 44%
OpenAI (e.g. ChatGPT series): 32%
Google (e.g. Gemini series): 5%
Anthropic (e.g. Claude series): 5%
Nous Research (e.g. Hermes series): 5%
X.AI (e.g. Grok series): 5%
Other: 5%

Resolution criteria

  • How the winner will be selected:

  • The market resolves to the LLM maker with the highest Elo score on the EQ-Bench 3 leaderboard at https://eqbench.com on January 1, 2027. Resolution is determined by navigating to the leaderboard, identifying the top-ranked model by Elo score, and determining its maker.

  • If the top Elo score belongs to a model from a maker not listed in the provided answer options (i.e., a new company or unlisted maker), the market resolves to "Other."
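
The resolution rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `resolve_market`, the leaderboard rows, and the model-to-maker mapping are hypothetical placeholders, not real EQ-Bench data or an official resolution script. The market text does not say how ties are handled; this sketch simply takes the first top-scoring row.

```python
# Hypothetical sketch of the resolution rule. The leaderboard rows and the
# model-to-maker mapping below are made-up placeholders, not EQ-Bench data.

# Answer options listed in the market (everything else resolves to "Other").
LISTED_MAKERS = {
    "Moonshot AI", "OpenAI", "Google", "Anthropic",
    "Nous Research", "X.AI",
}


def resolve_market(leaderboard, maker_of):
    """Return the answer option for the model with the highest Elo score.

    leaderboard: list of (model_name, elo) tuples.
    maker_of: dict mapping model_name to its maker's name.
    """
    if not leaderboard:
        raise ValueError("leaderboard is empty")
    # Pick the row with the highest Elo; ties go to the earliest row.
    top_model, _ = max(leaderboard, key=lambda row: row[1])
    maker = maker_of.get(top_model)
    # Unlisted or unknown makers fall back to "Other".
    return maker if maker in LISTED_MAKERS else "Other"


# Placeholder data for illustration only.
example_board = [("model-a", 1500.0), ("model-b", 1432.5), ("model-c", 1288.0)]
example_makers = {
    "model-a": "OpenAI",
    "model-b": "Moonshot AI",
    "model-c": "Some New Lab",
}
print(resolve_market(example_board, example_makers))  # OpenAI
```

If the top row instead belonged to "model-c", whose maker is not among the listed options, the same function would return "Other".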

Background

  • EQ-Bench 3's Elo score is calculated from pairwise model comparisons, in which an LLM judge rates each response against eight core dimensions of emotional intelligence. The test set contains 45 scenarios spanning 3 turns: the user messages set up the scenario and inject conflict, and the evaluated model must reply in character, with introspection blocks that expose its reasoning and theory-of-mind understanding. Because EQ-Bench 3 is a subjective evaluation judged by an LLM (Sonnet 3.7), its results should be treated as roughly indicative, not as absolute truth.
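
To make the idea of a rating built from pairwise comparisons concrete, here is a minimal sketch of the textbook Elo update for a single comparison. It is not EQ-Bench 3's actual rating procedure, which may differ; the function names, the K-factor of 32, and the 400-point scale are conventional choices assumed for illustration.

```python
# Textbook Elo update for one pairwise comparison. This is an assumption-laden
# illustration, not the rating code used by EQ-Bench 3.

def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A's response was preferred, 0.0 if B's was,
    and 0.5 for a draw. k (assumed 32 here) controls the step size.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two equally rated models; the judge prefers A's response.
print(update_elo(1200.0, 1200.0, 1.0))  # (1216.0, 1184.0)
```

Each update depends on the opponent's current rating, so a model's score only has meaning relative to the other models in the pool.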

Considerations

  • Elo scores are relative and shift when new models are added, so the leaderboard's composition and rankings can change significantly as new models are evaluated. Rankings can also vary with the specific judge model and evaluation methodology used, which is a further reason to treat the benchmark's results as indicative rather than definitive.
