(Metaculites created this question with resolution date 2040.)
"A team of three expert interviewers will interact with a candidate machine system (MS) and three humans (3H). The humans will be graduate students in each of physics, mathematics and computer science from one of the top 25 research universities (per some recognized list), chosen independently of the interviewers. The interviewers will electronically communicate (via text, image, spoken word, or other means) an identical series of exam questions of their choosing over a period of two hours to the MS and 3H, designed to advantage the 3H. Both MS and 3H have full access to the internet, but no party is allowed to consult additional humans, and we assume the MS is not an internet-accessible resource. The exam will be scored blindly by a disinterested third party." The experts may come up with new questions to ask while administering the test.
If such a test is passed before 2028, then this resolves Yes. If such a test is conducted on a state-of-the-art AI in 2027, and the AI fails, then this resolves No. If neither criterion is met, then this will resolve to my credence that such a test could be passed by an existing AI system. I will not participate in this market.
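For anyone who wants the resolution rule spelled out, here is a minimal sketch of the three branches above. The function name and parameters are hypothetical and only for illustration. The priority ordering, where a pass before 2028 wins even if a separate 2027 attempt failed, is my reading of the text rather than something it states explicitly.

```python
from typing import Optional, Union


def resolve_market(
    passed_before_2028: bool,
    sota_failed_in_2027: bool,
    creator_credence: Optional[float] = None,
) -> Union[str, float]:
    """Illustrative resolution rule; all names here are hypothetical."""
    # Branch 1: a qualifying test was passed before 2028.
    if passed_before_2028:
        return "YES"
    # Branch 2: a state-of-the-art AI took the test in 2027 and failed.
    if sota_failed_in_2027:
        return "NO"
    # Branch 3: no qualifying test; fall back to the creator's credence,
    # expressed as a probability between 0 and 1.
    if creator_credence is None or not 0.0 <= creator_credence <= 1.0:
        raise ValueError("creator_credence must be a probability in [0, 1]")
    return creator_credence
```

For example, `resolve_market(False, False, creator_credence=0.3)` falls through to the third branch and returns the credence itself.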
EDIT: No other AI systems should be consulted. Systems which use an AI instrumentally (e.g. as in Google search results) are ok, but the test administrator should do their best to redact direct AI content, e.g. the AI-generated QA panels at the top of certain Google queries.
As stated, it's left open whether other AI systems may be consulted by both sides. If they may, then this ends up being a question about the gap between the best and second-best available AI systems at the time of testing.
I propose adding a clause that no other AI systems should be consulted. Systems which use an AI instrumentally (e.g. as in Google search results) are ok, but the test administrator should do their best to redact direct AI content, e.g. the AI-generated QA panels at the top of certain Google queries.
If no one objects within a week, I will add this to the question text. I'm very open to debate here, since I think this is a significant ambiguity in the resolution criterion as stated.
@Toby96 This is simply a clone of a pre-existing Metaculus-bot question/market. The market was resolved N/A on Manifold because it was deleted or otherwise unfindable on Metaculus.
You're right that it's a fairly narrow area of intelligence, but if it were expanded, I don't think your suggestion would be a good fit, since it would be more a test of robotics/sensory technology advancements than of AI advancements. Other expansions (ideally in a different market) could include performing economic tasks, widening the knowledge domain to a larger slice of STEM or to fields outside STEM, or other things that don't require robotics or other non-AI technical advances.