The Agent Company is a benchmark for measuring progress on automated remote workers that's been getting a lot of press, mostly mocking how poorly AI performed. That's the point of this market: if you think this research suggests AI is "not coming for your job anytime soon", then bet this down.
The benchmark involves completing contrived tasks meant to simulate running a company. The best score so far is Claude at 24% (I'm guessing GPT-o3 will do better).
This market resolves-to-PROB at whatever score the best AI achieves by market close. If the benchmark is saturated, we'll resolve early to 100% (YES). Note that this market can't resolve NO but it can theoretically resolve as low as 24%.
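To make the resolution rule concrete, here's a minimal sketch in Python. The function name, the `saturated` flag, and the `baseline` parameter are all made up for illustration. In particular, whether the benchmark counts as saturated is a judgment call, not something this code decides.

```python
def resolution_percent(reported_scores, saturated=False, baseline=24.0):
    """Return the resolves-to-PROB value, in percent.

    reported_scores: benchmark scores (in percent) reported before market close.
    saturated: hypothetical flag for the "benchmark is saturated" judgment call.
    baseline: the best score at market creation (Claude at 24%).
    """
    if saturated:
        # Early resolution to 100% (YES).
        return 100.0
    # Otherwise resolve to the best score seen so far, never below the baseline.
    return max([baseline, *reported_scores])
```

With no new results this returns 24.0, which is why the market can't resolve NO.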
@dreev practical solution but wondering whether you’d endorse 44% as your best guess? (Not that it matters for the resolution, just curious.)
@Popsicle2338 Well, I don't have a better guess. I can be easily swayed by arguments if anyone has any.