The Agent Company is a benchmark for measuring progress on automated remote workers that's been getting a lot of press, mostly mocking how poorly AI performed. That's the point of this market: if you think this research suggests AI is "not coming for your job anytime soon", then bet this down.
The benchmark involves completing contrived tasks meant to simulate running a company. The best score so far is Claude at 24% (I'm guessing GPT-o3 will do better).
This market resolves-to-PROB at whatever score the best AI achieves by market close. If the benchmark is saturated, we'll resolve early to 100% (YES). Note that this market can't resolve NO but it can theoretically resolve as low as 24%.
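For anyone who wants the resolution rule spelled out, here is a rough sketch in Python. The function name, the list of scores, and the `saturated` flag are all made up for illustration. The market doesn't define a numeric threshold for "saturated", so the sketch treats it as a judgment call passed in by hand rather than guessing one.

```python
def resolution_percent(scores_percent, saturated=False):
    """Return the percentage this market would resolve to.

    scores_percent: benchmark scores (0-100) reported by market close.
    saturated: hypothetical flag for "the benchmark is saturated";
               the market description gives no numeric cutoff for this.
    """
    if saturated:
        # Early resolution to 100% (YES).
        return 100.0
    if not scores_percent:
        raise ValueError("need at least one reported score")
    best = max(scores_percent)
    if not 0.0 <= best <= 100.0:
        raise ValueError("scores must be percentages between 0 and 100")
    # Resolve-to-PROB at the best score any AI has achieved.
    return float(best)


# 24.0 is the best score mentioned above; 30.5 is an invented later result.
print(resolution_percent([24.0]))
print(resolution_percent([24.0, 30.5]))
print(resolution_percent([24.0], saturated=True))
```

Since 24% is already on the board and the rule takes the maximum, the result can never drop below 24, which is why the market can't resolve NO.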
Update 2026-05-21 (PST) (AI summary of creator comment): The creator is considering resolving to 44% as a best guess, since the benchmark does not appear to be actively maintained with new results. The creator is open to arguments for a different value before resolving.
@dreev practical solution but wondering whether you’d endorse 44% as your best guess? (Not that it matters for the resolution, just curious.)
@Popsicle2338 Well, I don't have a better guess. I can be easily swayed by arguments if anyone has any.