Will OpenAI announce GPT-5 (or a model better than o3) on July 17, 2025?
Resolved N/A on Jul 17

Background

OpenAI o3 (released 16 Apr 2025) is the current “frontier” reasoning model. OpenAI will hold a livestream on July 17, 2025 to release something it has not yet disclosed. Because OpenAI sometimes drops the “GPT-n” branding (e.g., for the o-series), a strength-based fallback is needed in case the name “GPT-5” is not used.

Benchmarks for o3:

MMLU (5-shot): 86.9 %

GPQA (Diamond): 83.3 %

MMMU (0-shot, multimodal): 82.9 %

SWE-bench Verified: 69.1 %

ARC-AGI-Pub (high-compute): 88 %

Resolution Criteria

If either Condition A or Condition B is satisfied, the market resolves to "YES".

Condition A — Naming test (simple)

YES if, during the day window (17 July 2025), an official OpenAI communication (blog post, livestream, press release) clearly calls the newly announced model “GPT-5”, “ChatGPT-5”, or “GPT 5”. Benchmarks will NOT be considered for Condition A.

Condition B — Strength test (if a different name is chosen)

YES if all of the following are true:

  1. Announcement timing: The model is publicly announced on 17 July 2025.

  2. Benchmark disclosure: Within 7 days of the announcement, OpenAI publishes official scores (blog, system card, or eval sheet), using the same protocol as the o3 numbers above, for all five benchmarks listed in the Background section.

  3. Performance threshold: For each of those benchmarks, the new model’s score is equal to or greater than the o3 score shown above.

  4. Benchmark substitution: If any of the benchmarks listed above is omitted, I will select another reasonable benchmark to substitute for it. The aim of Condition B is to evaluate whether the new model, if released, is at least as intelligent as o3.

  • Update 2025-07-17 (PST) (AI summary of creator comment): In response to a question about how a potential 'ChatGPT agent' would be handled, the creator has indicated that such a product will be evaluated against Condition B. The determining factor will be whether its performance on the five listed benchmarks meets or exceeds o3's scores, not its classification as a 'model' versus an 'agent'.
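
The criteria above can be read as a simple boolean rule. The following is a minimal Python sketch of that reading, not the creator's actual resolution procedure. The `Announcement` record, its field names, and the function names are hypothetical, introduced only for illustration. Two parts of the criteria are simplified by assumption: the 7-day disclosure window is folded into the `scores` field, and the creator's discretionary benchmark substitution (item 4) is not modeled, so a missing benchmark simply fails Condition B here.

```python
from dataclasses import dataclass, field

# o3 baseline scores (percent), copied from the Background section.
O3_BASELINE = {
    "MMLU (5-shot)": 86.9,
    "GPQA (Diamond)": 83.3,
    "MMMU (0-shot, multimodal)": 82.9,
    "SWE-bench Verified": 69.1,
    "ARC-AGI-Pub (high-compute)": 88.0,
}

# Names accepted by Condition A.
ACCEPTED_NAMES = {"GPT-5", "ChatGPT-5", "GPT 5"}


@dataclass
class Announcement:
    """Hypothetical record of what OpenAI announced (illustration only)."""
    announced_on_july_17: bool
    official_name: str
    # Benchmark name -> official score in percent. Assumed to contain only
    # scores published within 7 days under the same protocol as o3's.
    scores: dict = field(default_factory=dict)


def condition_a(a: Announcement) -> bool:
    """Naming test: benchmarks are ignored."""
    return a.announced_on_july_17 and a.official_name in ACCEPTED_NAMES


def condition_b(a: Announcement) -> bool:
    """Strength test: every listed benchmark must meet or exceed o3."""
    if not a.announced_on_july_17:
        return False
    # Assumption: a missing benchmark fails the test. The real criteria let
    # the creator substitute another benchmark, which is a judgment call.
    return all(
        name in a.scores and a.scores[name] >= baseline
        for name, baseline in O3_BASELINE.items()
    )


def resolves_yes(a: Announcement) -> bool:
    """YES if either condition is satisfied."""
    return condition_a(a) or condition_b(a)


if __name__ == "__main__":
    # A differently named product with scores that tie o3 on every benchmark.
    tie = Announcement(True, "ChatGPT agent", dict(O3_BASELINE))
    print(resolves_yes(tie))
```

In this sketch a tie counts as passing because the threshold is "equal to or greater than", and a product such as a 'ChatGPT agent' is handled by `condition_b` regardless of whether it is called a model or an agent.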
