This market predicts whether either Claude 4 model (claude-sonnet-4, claude-opus-4) will hold the top position on the EQ-Bench Creative Writing v3 leaderboard at the moment both of them have been added to the site. EQ-Bench evaluates large language models on creative writing tasks, with rankings available at EQ-Bench Creative Writing v3 Leaderboard. The market resolves to 'Yes' if either Claude 4 model is listed as the top model on the leaderboard at that time; otherwise it resolves to 'No'.
If either model is not on the benchmark leaderboard by July 1, 2025, the market resolves to 'No'.
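The resolution rule above can be sketched as a small function. This is only an illustration of the stated criteria, not anything the market or EQ-Bench provides: the function name `resolve_market`, its parameters, and the choice to treat July 1, 2025 as an inclusive deadline are all assumptions.

```python
from datetime import date

# Model names as given in the market description.
CLAUDE_4_MODELS = {"claude-sonnet-4", "claude-opus-4"}
# Assumption: "by July 1, 2025" is read as inclusive of that day.
DEADLINE = date(2025, 7, 1)


def resolve_market(leaderboard, both_listed_on):
    """Return 'Yes' or 'No' under the stated resolution criteria.

    leaderboard: model names ordered best first, as of the moment both
        Claude 4 models were listed (hypothetical input format).
    both_listed_on: date on which both models were first listed, or None
        if that never happened.
    """
    # Either model missing by the deadline resolves the market to 'No'.
    if both_listed_on is None or both_listed_on > DEADLINE:
        return "No"
    if not CLAUDE_4_MODELS.issubset(leaderboard):
        return "No"
    # 'Yes' only if a Claude 4 model holds the top position at that moment.
    return "Yes" if leaderboard[0] in CLAUDE_4_MODELS else "No"
```

Under this reading, a Claude 4 model in second place at the moment both are listed resolves 'No', regardless of how the ranking changes afterwards.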
@genesi700 Please let us know if the re-resolution to No is mistaken, for example if Claude 4 was #1 on the leaderboard for some time but was then retested and moved to #2 or something. Otherwise, according to the resolution criteria and the current ranking of Claude Opus 4 on EQ-Bench, the market resolves No.