
This market resolves to the year in which an AI system exists which is capable of passing a high quality, adversarial Turing test. It is used for the Big Clock on the manifold.markets/ai page.
The Turing test, originally called the imitation game by Alan Turing in 1950, is a test of a machine's ability to exhibit intelligent behaviour equivalent to, or indistinguishable from, that of a human.
For proposed testing criteria, refer to this Metaculus Question by Matthew Barnett, or the Longbets wager between Ray Kurzweil and Mitch Kapor.
As of market creation, Metaculus predicts there is an ~88% chance that an AI will pass the Longbets Turing test before 2030, with a median community prediction of July 2028.
Manifold's current prediction of the specific Longbets Turing test can be found here:
/dreev/will-ai-pass-the-turing-test-by-202
This question is intended to determine the Manifold community's median prediction, not just of the Longbets wager specifically but of any similarly high-quality test.
Additional Context From Longbets:
One or more human judges interview computers and human foils using terminals (so that the judges won't be prejudiced against the computers for lacking a human appearance). The nature of the dialogue between the human judges and the candidates (i.e., the computers and the human foils) is similar to an online chat using instant messaging.
The computers as well as the human foils try to convince the human judges of their humanness. If the human judges are unable to reliably unmask the computers (as imposter humans) then the computer is considered to have demonstrated human-level intelligence.
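The unmasking criterion described above can be sketched as a simple vote-counting rule. This is only an illustration: the judge count and the "reliably" threshold below are made-up parameters, not the wager's actual terms (which the Turing Test Committee would pin down).

```python
def turing_test_passed(judge_votes: list[bool], threshold: float = 0.5) -> bool:
    """Return True if the AI fooled at least `threshold` of the judges.

    judge_votes[i] is True if judge i mistook the AI for a human,
    i.e. failed to unmask it as an impostor.
    """
    if not judge_votes:
        raise ValueError("need at least one judge")
    fooled = sum(judge_votes)  # True counts as 1
    return fooled / len(judge_votes) >= threshold


# Illustrative: 2 of 3 judges fooled passes under a majority threshold,
# while 0 of 3 does not.
print(turing_test_passed([True, True, False]))
print(turing_test_passed([False, False, False]))
```

A real test would also need to specify how many sessions each judge runs and how ties are handled; the hypothetical `threshold` parameter stands in for whatever "reliably" ends up meaning.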
Additional Context From Metaculus:
This question refers to a high quality subset of possible Turing tests that will, in theory, be extremely difficult for any AI to pass if the AI does not possess extensive knowledge of the world, mastery of natural language, common sense, a high level of skill at deception, and the ability to reason at least as well as humans do.
A Turing test is said to be "adversarial" if the human judges make a good-faith attempt, in the best of their abilities, to successfully unmask the AI as an impostor among the participants, and the human confederates make a good-faith attempt, in the best of their abilities, to demonstrate that they are humans. In other words, all of the human participants should be trying to ensure that the AI does not pass the test.
Note: These criteria are still in draft form, and may be updated to better match the spirit of the question. Your feedback is welcome in the comments.
Goal: improve the resolution criteria.
Conflict of interest: I am "no" on this market (I think it will happen after 2050).
Why the Turing test is good in theory: If you can't think of any intellectual task the AI can't do (and a human can), it is hard to see why it isn't an AGI; and if you know one task it can't do, just ask the AI to do it, or to explain how it would do it.
(And it takes care of the case where it is an AGI, but you can tell it is an AI simply because you can see that it is a computer or a robot.)
But it fails in practice in two ways:
False negative: Some ways to detect that it is an AI have nothing to do with intellectual tasks: maybe the AI has some ethical constraints, or maybe it has a distinctive style of writing, and both can be used to detect it.
I think it would be unconvincing to people who believe it is an AGI if it fails the Turing test merely because it can't say some word or never makes a grammar mistake.
False positive: In theory you can ask for the category of tasks the limited AI can't do, but random people will probably not understand the limits of the AI, and will not think of those tasks.
It would be unconvincing to people who believe it is not an AGI if it passes the Turing test but is still unable to score well on ARC-AGI or similar tests.
Proposition: Instead of using the Turing test, we can wait some time for people to find any intellectual task that the AI can't do and we can.
If they find one, it isn't an AGI; if after some time they don't, it is an AGI, and the market resolves at the date when that version of the AI was released.
It's far from perfect, but I think it is more in the spirit of this market. What do you think?
@dionisos I think this is right and @PhilosophyBear is also right and the way to have the best of both worlds is to require such a high quality Turing test that it doesn't have those false positives/negatives. (Nice articulation of those, btw, thank you.)
I'm worried that having "Turing test" in the title will, more and more, lead traders astray. This market was created two years ago, back when it was easy to make an AI fall on its face answering a single question. As Dwarkesh Patel eloquently put it a few months ago, sometimes goalpost moving is fair. Because you learn that the goalposts were wrong. I tend to think that a high enough quality Turing test will continue to work, but it might get to the point where grilling the AI has to include having it go out and perform real work on the real internet, coherently over hours or days, to prove that it's a truly general intelligence.
In short, I suspect plenty of traders are betting on EARLIER in this market because they predict the Longbets version of the Turing test, if set up faithfully as originally spec'd, will soon fall. But my interpretation of the market description is that we mean something more stringent than that.
I think this market might be less misleading if we simply removed "[High Quality Turing Test]" from the title and let the existing market description convey the nuance of what we actually mean by AGI. As the market description concludes, we're aiming at the spirit of the question for what people actually mean by AGI. E.g., definitely not the models we have as of early 2026, whether or not those models can be unmasked in a couple of hours of text-only interaction.
@dreev The existing market description references the Longbets version of the Turing test multiple times and does not contain a single reference to "working for days". I don't know how it can be interpreted any differently.
@MikhailDoroshenko I'm looking especially at the final note, "These criteria are still in draft form, and may be updated to better match the spirit of the question. Your feedback is welcome in the comments."
It also says this market isn't about the Longbets Turing test specifically.
I guess it's high time we get this more pinned down.
Maybe a simpler way to put this: Suppose, hypothetically, that we went all out on running the highest possible quality Turing test and the AI passed, today. Which would be our reaction for this market?
a) Wow, apparently we hit AGI without all hell breaking loose.
b) Oops, the Turing test apparently doesn't work to distinguish AGI.
@MikhailDoroshenko I think I mostly agree, but, again, consider the final note in the market description about updating the resolution criteria to better match the spirit of the question. Still, the market description spends enough time on Turing tests specifically that it would feel unfair to throw that out completely.
Maybe what the market description implies is that we can go beyond the Longbets wager to a Turing test that's as stringent as necessary to match the spirit of the question, perhaps taking us closer to Aschenbrenner's drop-in remote worker. I think testing for that kind of capability is still in the scope of an expanded Turing test (as long as all the communication involved is text-based?).
How long can a market be in draft form before it needs to be locked in? With 1.1k unique traders, surely we all had something in mind when we participated.
For what it's worth, this market is also being used as the source of truth for another market: /elongatedmuskrat/will-we-burn-all-this-firewood-befo
What I had in mind is an AI capable of doing almost any intellectual task a human is able to do, at an ok level of competence (with enough time for both).
Even if current AIs are very competent, I think they still can't, because they lack stability and the ability to learn quickly.
Putting stuff in the context window makes the AI more knowledgeable about the current context, but also generally worse (and training needs far too much data to improve the AI, compared to us).
And I find it quite plausible that the two are linked: if we manage to stay stable over long periods, maybe it is because we keep learning from the context we put ourselves in while accomplishing long tasks (as opposed to just holding all the context in our minds; in fact, our ability to hold everything in mind is probably terrible compared to what current AIs can do).
@256 two years ago I predicted November 2026 with pretty high confidence. I think that's pretty much right. 90% it's within the range August–December 2026.

Does AI already have human-level intelligence? The evidence is clear
Nature article related to this question; it argues for a "yes" answer (02/02/2026).

However, I just sold my shares because I think this market is fundamentally flawed. It seems, based on previous comments, that it requires someone:
a) bothering to make the test
b) making a frontier LLM that is deeply misaligned
The question of AGI is completely separate from the latter. I think at this point we know that frontier models will be heavily constrained to not appear human, so any answer regarding their Turing test capability will be delayed by years, purely because we'd have to wait for some scrappy, misaligned alternative models to fill the gap.
@strutheo I believe we got the clarification that if AGI doesn't happen by the end of 2048, we resolve this market to "2049". From the graph you can see that's how traders are treating it, as effectively "2049+".
@JoaoPedroSantos Don't think there are any actual tests planned. Probably wouldn't be smart or easy to make such a deceptive AI. Best chance to resolve any of these would be someone conceding the longbet without any Turing test having happened. Hell would break loose on prediction markets 😜
@Primer It certainly sounds like Longbets is committed to organizing a test at least in 2029. But it is correct that no one will ever train an AI to lie to the extent required to pass the test, which pretty much guarantees that by the letter, Kurzweil should lose.
But it could happen that one of them will concede, either Kurzweil for that reason, or Kapor if he believes that AIs are smart enough to do this (even if they would not actually do it due to being trained to be honest etc.)
I remember someone suggesting that Kapor would not be following the spirit of the bet if he claimed to win because the AI was not "dumb enough," i.e. because it was obviously smarter than a human and people could distinguish it that way. But I don't think that is true or that this is how Kapor and Kurzweil understood the bet, because Kapor's text specifically talks about how difficult it will be for an AI to precisely imitate a human. He knows it is not just about being smart. And this is equally true of Turing -- he knows that if you look at it as a test of intelligence, it is biased against the AI, because a lot more is required than intelligence. That is not an accident, it is intentional, in order to remove all possibility of doubt, if the AI can pass it.
I wonder what percentage "reliably" will be defined as. The Turing Test Committee might decide that they want more data than "2 out of 3 judges".