Will a robot be created that is capable of passing Steve Wozniak's "The Coffee Test" before 2040?

Steve Wozniak, the co-founder of Apple Inc., proposed "The Coffee Test" as a benchmark for evaluating the capabilities of autonomous robots. The test requires a robot to enter an unfamiliar house, find the kitchen, identify the necessary tools and ingredients, and then prepare a cup of coffee. The Coffee Test challenges a robot's ability to navigate unknown environments, recognize objects, manipulate tools and materials, and follow a sequence of tasks to achieve a specific goal. While there have been significant advancements in robotics, no robot has yet passed The Coffee Test.

Will a robot be created that is capable of passing Steve Wozniak's "The Coffee Test" before January 1st, 2040?

Resolution criteria:

This question will resolve to "YES" if, before January 1st, 2040, a robot is publicly and credibly documented to have:

  1. Successfully entered an unfamiliar residential environment, located the kitchen, and autonomously navigated the space, including:

    a. Identifying and avoiding obstacles.
    b. Adapting to different lighting conditions and surfaces.

  2. Demonstrated the ability to identify and manipulate various kitchen tools, appliances, and ingredients, such as:

    a. Recognizing coffee makers or machines, coffee filters, coffee grinders, and kettles.
    b. Identifying coffee beans or grounds, water sources, and optional items like sugar, milk, or creamer.
    c. Operating appliances and tools, such as turning on the coffee maker, grinding coffee beans, and pouring water.

  3. Exhibited the capability to follow a sequence of tasks to prepare a cup of coffee, including:

    a. Retrieving and preparing the necessary tools, appliances, and ingredients.
    b. Following a logical order of steps to make the coffee.
    c. Adjusting to variations in coffee-making equipment or processes based on the available tools and appliances.

  4. Successfully completed The Coffee Test, resulting in a properly prepared cup of coffee, within 20 minutes, a time limit comparable to that of an average human performing the same task.

A successful demonstration must be accompanied by:

  1. A publicly accessible report or documentation describing the robot's design, capabilities, and performance during The Coffee Test, including the time taken to complete the test.

  2. Reporting of the findings in one or more peer-reviewed scientific journals or relevant media outlets.

I will use my discretion when resolving this question, possibly in consultation with experts.
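
For readers who want the written criteria in checklist form, here is a minimal sketch. It is only an illustration: the `Demonstration` class, its field names, and `meets_written_criteria` are hypothetical names made up for this example, and the actual resolution remains subject to the discretion noted above.

```python
from dataclasses import dataclass
from datetime import date

# Values taken from the resolution criteria above.
DEADLINE = date(2040, 1, 1)   # demonstration must be documented before this date
MAX_MINUTES = 20              # time limit from criterion 4


@dataclass
class Demonstration:
    """Hypothetical record of one documented Coffee Test attempt."""
    documented_on: date
    navigated_unfamiliar_home: bool      # criterion 1
    identified_and_operated_tools: bool  # criterion 2
    followed_task_sequence: bool         # criterion 3
    minutes_taken: float                 # criterion 4 (time limit)
    coffee_properly_prepared: bool       # criterion 4 (result)
    public_report: bool                  # documentation requirement 1
    reported_in_journal_or_media: bool   # documentation requirement 2


def meets_written_criteria(d: Demonstration) -> bool:
    """Return True only if every written condition holds at once.

    This ignores the resolver's discretion, which can override it.
    """
    return (
        d.documented_on < DEADLINE
        and d.navigated_unfamiliar_home
        and d.identified_and_operated_tools
        and d.followed_task_sequence
        and d.coffee_properly_prepared
        and d.minutes_taken <= MAX_MINUTES
        and d.public_report
        and d.reported_in_journal_or_media
    )
```

The conditions are joined with `and` because the criteria are cumulative: a demonstration that takes 25 minutes, or that has no public report, fails the written criteria even if the coffee itself is fine.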

Similar question with a different timescale:


This isn't a complete solution for the task, but it's definitely a step in the right direction.

(check the video in the tweet) https://twitter.com/_akhaliq/status/1724278737531801931

@MatthewBarnett If the robot is demonstrated to do this while hooked up to some AI system running somewhere else, does this still resolve yes?

Meta comment not limited to this specific question/platform: I'd prefer a phrasing more along the lines of "Will it be demonstrated that [robot/AI/...] passes/achieves/... X?" in the question and title. It would more closely reflect the resolution criteria.

To illustrate my reasoning:

Will I be capable of swimming 1000m in 2024?

Will it be demonstrated that I swim 1000m in 2024?

Those will have very different odds.

Your questions get referenced quite often (rightfully so, because they are awesome!) and chances are they will appear in more and more mainstream media articles, written by authors who have never seen a prediction market and have never read the resolution criteria. The exact wording is important, and in my opinion it is worth sacrificing some readability to gain some precision.

@Gigacasting that is literally the kitchen of the AI developers, not some unknown kitchen, lol.

@ChristopherKing I don't think they were saying this is related to the resolution criteria, but rather that it is evidence of the current "embodiment" progress of AI research. Given how quickly LLMs, image generators, etc. have advanced over just a few years, plus the fact that we could easily go through the end of scaling, an AI winter, and a lot of progress in the next AI paradigm before 2040 anyway, YES seems correct here.
