How long until one of Gemini, Claude, etc... match the capabilities of O1?
17
Ṁ2769
2026
7%
Oct 12th 2024
28%
Dec 12th 2024
50%
April 12th 2025
10%
September 12th 2025
3%
April 12th 2026
3%
Other

OpenAI's O1 model represents a new paradigm of LLMs. How long until a competitor catches up?

"Catches up" / "matches capabilities" is defined as matching or exceeding the O1 pass@1 benchmarks on AIME, Codeforces, and GPQA at the time of publication:

  • 83.3-percentile on AIME

  • 89-percentile on Codeforces

  • 78% accuracy on GPQA

Get Ṁ1,000 play money
Sort by:
bought Ṁ50 April 12th 2025 YES

Apparently o1's AIME score was pass@10000, not pass@1. Criteria should be updates accordingly