Daniel Reeves
closes Jan 2, 2029
Will AI pass the Turing test by 2029 Jan 1?
70%
chance

This is based on the inaugural longbets.org bet. I think Kapor will win and Kurzweil will lose, i.e., that a computer will not pass [what Kurzweil calls a valid] Turing test by 2029.

((Bayesian) Update: But I admit the probability has jumped up recently!)

See also https://www.metaculus.com/questions/3648/longbets-series-by-2029-will-a-computer-have-passed-the-turing-test/

Real-money version for anyone confident that Kurzweil's side has a good chance: https://biatob.com/p/11788533128982732233 (updated link with new odds offered)

Gigacasting
is predicting YES at 69%
Mikhail Doroshenko
bought Ṁ47 of YES

Am I correct to assume that this market will resolve the same way as the Long Bets bet it references?

Daniel Reeves
is predicting NO at 62%

@MikhailDoroshenko Affirmative

Lorenzo
is predicting NO at 70%

I think this market is updating on this tweet https://twitter.com/sama/status/1590416386765254656 or on insider information. I find it surprising!

jojomonsta7777
bought Ṁ14 of NO

The last link doesn't work for me. Can anyone tell me where I can bet real money on this?

Daniel Reeves

@JohannMuhlbach Thanks for catching that! The bet expired so I made a new one and replaced the link. Should work now (for another few months before it expires again).

jojomonsta7777
is predicting NO at 49%

@dreev Thank you! I signed up, but then I realized I of course want to bet against Kurzweil xD

Lorenzo
bought Ṁ400 of NO

I agree with Daniel Reeves's comments, especially since he's the one judging this market. I also think Manifold points will likely be worthless if this resolves YES.

L

@Lorenzo why would it be worthless? wouldn't there simply be a lot of ai traders on it?

Isaac King
is predicting YES at 67%

@L Presumably Lorenzo believes we'll all be dead.

Lorenzo
is predicting NO at 63%

@IsaacKing I just think I will spend much less time on Manifold

Gigacasting
sold Ṁ6 of NO

All these clowns who’ve never interacted with anyone outside the cognitive elite will call the Turing test meaningless soon enough.

Of course it will pass—and could pass today. It's extremely narrow AI with an abundantly clear objective function: does a person think it sounds human?

It's about as difficult as tuning StabilityAI prompts—some version of these giant models with fine-tuning could do this, given ~10-100k iterations of the imitation game to focus on chat and remove the less human-sounding responses.

(There's a tad more to it than that, but not much. Not a good measure of intelligence.)

Daniel Reeves

Kurzweil himself explicitly disagrees. The version of the Turing test Kurzweil and Kapor have agreed on (and which Kurzweil confirmed last month he's still on board with) is one where experts probe the AI for hours to determine if it's actually human-level, not just whether it sounds human in free-form chatting.

Gigacasting
is predicting YES at 45%

The people who think the Turing test is hard have literally never interacted with anyone outside some narrow, niche bubble.

This is the most narrow form of AI imaginable and trivial for ~85 IQ today, doable for average in due time, and probably much harder for the “145 IQ” version.

Don’t confuse impersonation of your social circle with the average human.

Martin Randall
is predicting YES at 45%

Typical humans cannot convincingly convey a cohesive fictional back story when probed by experts for hours. See also: espionage. A machine able to pass this "strict" test would have to be much more intelligent than humans at this task.

GeorgeVii
is predicting YES at 45%

I am reminded of this survey of several UK MPs:

https://www.bbc.co.uk/news/uk-19801666

(I'm sure there was an amusing video along with it that I now fail to find after a quick look.)

BTE

@Gigacasting This is one of your best takes on here.

Daniel Reeves

I listened to Kurzweil talk about the Turing test a bit on a recent podcast -- https://www.youtube.com/watch?v=ykY69lSpDdo&t=66s -- and he's clear that he's talking about a version where an expert grills the AI for as long as it takes. I.e., this question is a proxy for "will AGI happen by 2029?". I think 50% is still much too high for this market.

Daniel Reeves

@dreev I see the market still thinks AGI by 2029 is likely. I think the market is wrong but not sure how much more mana I want to pour in. My meta prediction is that the market probability will keep climbing as new AI capabilities are hyped, before finally dropping as 2029 approaches and Kurzweil admits that we're still not there (as he very clearly admits currently).

Martin Randall

@dreev I think the big question is how large the probability space is where AGI is created and we are willing to let a random human talk to it/them for arbitrary lengths of time, but it does not kill Kurzweil and all humans.

Daniel Reeves

@MartinRandall Yeah, or a possibly more general way to put that is that it only makes sense to bet YES on this if you think we'll get AGI that somehow doesn't make mana worthless (for better or worse).

I think AGI by 2029 is honestly quite a bit below 50% probability, but AGI by 2029 and a world where it matters that you won mana betting YES here? That's even lower probability.

Gigacasting
bought Ṁ3 of YES

PaLM understands jokes and achieves better benchmark performance than an average person. It's almost trivial to produce a model optimized to "pass the Turing test" by paying people to rate the odds they're chatting with a human, and fine-tuning the model based on those ratings. Even without 10x/yr model scaling, a five-figure budget on humanness ratings would achieve this today.

Gigacasting

Up 70% on Metaculus since this comment 😉

Gigacasting
bought Ṁ50 of YES

(Arbitrage: Metaculus at 65%)

Martin Randall

@Gigacasting The two-hour interview sessions make it more expensive, no?
Daniel Reeves

@Gigacasting I just reread the rules at longbets.org/1 and I think the biggest question mark is whether Kurzweil and Kapor will agree on choosing experts as the judges. (And perhaps also whether they agree on choosing articulate, conscientious human foils.)

If so -- and reading Kurzweil's wild sci-fi reasoning for why he expected to win the bet, I think that would be fair -- then you've got experts grilling the AI for essentially as long as they need, and fooling them really requires AGI. Which is what the spirit of the bet was about.

It used to feel obvious to me that we were nowhere near getting computers to pass the Turing test because it was trivial to make up a single common-sense question like "what's bigger, your mom or a french fry?" and the computer would immediately fall on its face, with no hope of actually understanding what was being asked. That changed in the last year or so. Now large language models genuinely understand questions like that. (At least they're getting close to consistently answering them impeccably and I don't know how else to define "genuinely understand".)

But it's still easy to unmask the AI with a handful of follow-up questions. The leap we made recently is mind-boggling but even bigger leaps are still needed before we'll pass the Kurzweil/Kapor version of the Turing Test.

Gigacasting

Using experts in ML/AI as judges and 145+ IQ "foils" makes it somewhat trickier, but doesn't change the fact that this does not require "AGI".

It’s a parlor game for which the best thing to do is simply gather vast amounts of data on what people judge as “human sounding”; not only could a machine win within a year (with $1M budget for mechanical turks) but it measures zero higher-primate abilities such as long term planning, emotional states, etc.

Gigacasting

Everyone will soon agree this was a dumb test, just as they are "not impressed" that GPT-3 makes Joe Biden look like a lower-IQ bird, or that DALL-E has created almost all of the best art made in 2022.

Gigacasting

Better tests for useful AGI are Rodney Brooks's household-servant and hospital-architect tests, and a much simpler one is beating a human at a de novo game made up on the fly. (I.e., zero-shot tasks a human can adapt to on the fly, not things you can apply supervised learning to, which the Turing test trivially is.)

Daniel Reeves
bought Ṁ599 of NO

@Gigacasting Can't you do the de novo game test via text? That sounds like a beautiful example of how an expert judge in the Turing test can test for AGI.

Gigacasting

https://www.unz.com/akarlin/stupid-people/

Only a small fraction of people can handle basic reasoning; I'd encourage you to test made-up games with some villagers in Rwanda and see how it goes.

The test was an offhand thought experiment and a horrendous way to test nearly any abilities that matter. Speech, vision, and 100-IQ human language are beyond solved.

When it has the memory or emotion of a pet, or the reasoning and coordination to “make a cup of coffee” in an arbitrary room, that’s a good sign of human-ish intelligence.

Wordcel-ing a couple extra levels is already solved and useless.

Daniel Reeves
is predicting NO at 44%

Well, my probability has gone up a bit in light of the latest large language models. See https://manifold.markets/dreev/will-googles-large-language-model-p

Daniel Reeves
bought Ṁ20 of NO

My true probability here is <5%, and pretty much all of that probability mass is on the test being conducted wrong, like the human foils being uncooperative.

Daniel Reeves
bought Ṁ1 of NO

I accidentally had this closing early. Fixed now.

Matthew Barnett
bought Ṁ1 of YES

I duplicated this question here: https://manifold.markets/MatthewBarnett/will-ray-kurzweil-win-his-2029-turi It's scheduled to close later.

Angola Maldives
bought Ṁ1 of NO

Uh, resolving in 3 days?

Nathan Braun
bought Ṁ1 of YES

Betting YES at 15% based on Metaculus's community estimate of 40%.