
Resolution Criteria:
If all of the following criteria are met, this question resolves to YES. Otherwise it resolves to NO.
1. Showcase Event: OpenAI must publicly showcase an AI assistant between January 1st, 2024, and January 1st, 2026. The AI assistant must demonstrate the ability to control a virtual desktop or browser environment.
2. Task Performance: The AI assistant must perform a series of routine white-collar job tasks which are specified in advance and observable by the public during the showcase. The tasks must be completed with minimal human correction, defined as less than 15% of tasks requiring human intervention in the task completion process, across tasks shown in the demo.
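As a rough illustration only, the two criteria above can be sketched as a single check. The `Showcase` fields and the `resolves_yes` function are hypothetical names, not part of the question. Treating the date window as inclusive on both ends is an assumption, as is reading "less than 15%" as a strict inequality over the tasks shown in the demo.

```python
from dataclasses import dataclass
from datetime import date

# Window from criterion 1; inclusive endpoints are an assumption.
WINDOW_START = date(2024, 1, 1)
WINDOW_END = date(2026, 1, 1)


@dataclass
class Showcase:
    """Hypothetical summary of one demo, filled in by whoever judges it."""
    day: date
    public_openai_showcase: bool
    controls_desktop_or_browser: bool
    white_collar_tasks_specified_in_advance: bool
    tasks_shown: int
    tasks_needing_human_intervention: int


def resolves_yes(s: Showcase) -> bool:
    # Criterion 1: Showcase Event.
    in_window = WINDOW_START <= s.day <= WINDOW_END
    showcase_event = (
        s.public_openai_showcase and in_window and s.controls_desktop_or_browser
    )

    # Criterion 2: Task Performance. A demo with no tasks cannot qualify.
    if s.tasks_shown <= 0:
        return False
    # Strictly fewer than 15% of tasks needed intervention
    # (integer arithmetic avoids floating-point edge cases at exactly 15%).
    minimal_correction = s.tasks_needing_human_intervention * 100 < 15 * s.tasks_shown
    task_performance = s.white_collar_tasks_specified_in_advance and minimal_correction

    return showcase_event and task_performance
```

Under this reading, a demo with 20 tasks and 3 interventions sits at exactly 15% and would not qualify, while 2 interventions out of 20 would. Whether the tasks count as "routine white-collar job tasks" remains a judgment call that the sketch takes as an input.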
@mods The creator was last active months ago. This should resolve to YES according to https://www.youtube.com/watch?v=8UWKxJbjriY.
@CharlesFoster
As with Operator, the ChatGPT Atlas demo focused on personal stuff:
Web search / web history search
Looking up stuff about movies
Planning a haunted house
Help with recipes and grocery shopping
@CharlesFoster The only parts I would consider possibly "routine white-collar job tasks" were reviewing a GitHub pull request and reviewing an email draft.
@CharlesFoster The "planning a haunted house" segment was meant to showcase task distribution through a Google Doc. That's pretty much a "routine white-collar job" task to me.
Few would deny that "web search / web history search" also happens in an office.
I think the Operator demo qualifies for the Showcase Event criterion, but not for the Task Performance criterion, mostly because it focused on personal assistant tasks rather than white-collar job tasks.
https://www.youtube.com/watch?v=CSE77wAdDLg