When in 2026 will OS-World Verified be saturated by AI models?
2027
June 30, 2026
| Answer | Probability |
| --- | --- |
| Jan-Mar 2026 | 20% |
| Apr-Jun 2026 | 29% |
| Jul-Sep 2026 | 31% |
| Oct-Dec 2026 | 20% |

#### Resolution criteria

Saturation on OS-World Verified means that a model can execute simple, realistic tasks in Linux-based environments using popular open-source applications, such as adding page numbers to a document or exporting a CSV file from a spreadsheet. The market resolves YES when the highest-performing model on the official OS-World Verified leaderboard reaches a success rate of 95% or higher. Resolution is determined by checking the official OS-World leaderboard at the time of resolution. If the benchmark is substantially modified or discontinued, the market resolves N/A.
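The resolution rule above can be sketched as a small check. This is only an illustration, not the market's actual resolution process: the function name `resolve`, the `benchmark_intact` flag, and the "NOT YET" label are hypothetical, and the leaderboard score has to be read off manually.

```python
# Threshold taken from the resolution criteria: 95% success rate or higher.
SATURATION_THRESHOLD_PCT = 95.0


def resolve(top_score_pct: float, benchmark_intact: bool = True) -> str:
    """Return a resolution label for a manually read leaderboard score.

    top_score_pct: success rate (in percent) of the highest-performing model.
    benchmark_intact: hypothetical flag, False if the benchmark was
        substantially modified or discontinued.
    """
    if not benchmark_intact:
        return "N/A"
    if top_score_pct >= SATURATION_THRESHOLD_PCT:
        return "YES"
    # "NOT YET" is an illustrative label, not an official market state.
    return "NOT YET"
```

For example, `resolve(95.0)` gives `"YES"`, while `resolve(94.9)` gives `"NOT YET"`.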

#### Background

OS-World was originally developed by researchers from the XLANG Lab at the University of Hong Kong and released in April 2024, with a major update, dubbed OS-World Verified, following in July 2025. Human performance is estimated at ~72%, and the best current systems have reached 84.4% of human capability. OS-World therefore remains far from saturated, with substantial headroom to human-level performance.
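To make the headroom concrete, here is a rough arithmetic sketch. It assumes that "84.4% of human capability" is a ratio relative to the ~72% human success rate; that reading is an assumption, and the figure may be meant differently. The constant and function names are illustrative only.

```python
HUMAN_SUCCESS_PCT = 72.0        # estimated human success rate (from the text)
BEST_RELATIVE_TO_HUMAN = 0.844  # assumed to be a ratio of human performance
THRESHOLD_PCT = 95.0            # saturation threshold from the resolution criteria


def implied_absolute_pct(human_pct: float, relative: float) -> float:
    """Absolute success rate implied by a score relative to humans."""
    return human_pct * relative


def headroom(current_pct: float, target_pct: float) -> float:
    """Percentage points still missing to reach the target."""
    return target_pct - current_pct


best_pct = implied_absolute_pct(HUMAN_SUCCESS_PCT, BEST_RELATIVE_TO_HUMAN)
gap_to_human = headroom(best_pct, HUMAN_SUCCESS_PCT)
gap_to_threshold = headroom(best_pct, THRESHOLD_PCT)

print(f"implied best score: {best_pct:.1f}%")
print(f"gap to human level: {gap_to_human:.1f} points")
print(f"gap to 95% threshold: {gap_to_threshold:.1f} points")
```

Under this reading, the best system sits at roughly 60.8% absolute, about 11 points below the human estimate and about 34 points below the 95% threshold.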

#### Considerations

Task difficulty changes over time in unpredictable ways. Many tasks can be completed with few or no GUI interactions, and the skill of interpreting the instruction is sometimes as important as the skill of using the computer. Uncontrollable factors, such as anti-crawling mechanisms and CAPTCHAs on websites, can cause the benchmark's signal to weaken gradually over time.
