This is based on the inaugural longbets.org bet. I think Kapor will win and Kurzweil will lose, i.e., that a computer will not pass [what Kurzweil calls a valid] Turing test by 2029.
((Bayesian) Update: But I admit the probability has jumped up recently!)
Real-money version for anyone confident that Kurzweil's side has a good chance: https://biatob.com/p/11788533128982732233 (updated link with new odds offered)
Am I correct to assume that this market will resolve the same way as the longbets bet it references?
I think this market is updating on this tweet https://twitter.com/sama/status/1590416386765254656 or on insider information. I find it surprising!
The last link doesn't work for me; can anyone tell me where I can bet real money on this?
I agree with Daniel Reeves's comments, especially since he's the one judging this market. I also think Manifold points will likely be worthless if this resolves YES.
All these clowns who’ve never interacted with anyone outside the cognitive elite will call the Turing test meaningless soon enough.
Of course it will pass, and could pass today. It's extremely narrow AI with an abundantly clear objective function: does a person think it sounds human?
It's about as difficult as tuning StabilityAI prompts: some version of these giant models, with fine-tuning, could do this given ~10-100k iterations of the imitation game to focus on chat and remove the less human-sounding responses (see the sketch after this comment).
(There's a tad more to it than that, but not much. Not a good measure of intelligence.)
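For concreteness, here's a minimal sketch of that filter-and-fine-tune loop. `generate_candidates` and `judge_sounds_human` are hypothetical stand-ins for the chat model and the human raters; nothing here is a real API, just the shape of the data-collection step:

```python
import random

def generate_candidates(prompt, n=4):
    # Hypothetical placeholder: a real system would sample n replies
    # from a large language model fine-tuned for chat.
    return [f"canned reply {i} to: {prompt!r}" for i in range(n)]

def judge_sounds_human(reply):
    # Hypothetical placeholder: a real system would collect human judgments
    # ("does this reply sound human?"), e.g. from paid raters.
    return random.random() > 0.5

def collect_finetuning_data(prompts):
    # Keep only the replies judged human-sounding; the kept (prompt, reply)
    # pairs would become the supervised fine-tuning set for the next round.
    dataset = []
    for prompt in prompts:
        for reply in generate_candidates(prompt):
            if judge_sounds_human(reply):
                dataset.append((prompt, reply))
    return dataset

if __name__ == "__main__":
    data = collect_finetuning_data(["What's bigger, your mom or a french fry?"])
    print(f"kept {len(data)} human-sounding (prompt, reply) pairs for fine-tuning")
```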
Kurzweil himself explicitly disagrees. The version of the Turing test Kurzweil and Kapor have agreed on (and which Kurzweil confirmed last month he's still on board with) is one where experts probe the AI for hours to determine if it's actually human-level, not just whether it sounds human in free-form chatting.
The people who think the Turing test is hard have literally never interacted with anyone outside some narrow, niche bubble.
This is the narrowest form of AI imaginable: trivial at the ~85-IQ level today, doable at the average level in due time, and probably much harder for the "145 IQ" version.
Don't confuse impersonating your social circle with impersonating the average human.
Typical humans cannot convincingly convey a cohesive fictional back story when probed by experts for hours. See also: espionage. A machine able to pass this "strict" test would have to be much more intelligent than humans at this task.
I listened to Kurzweil talk about the Turing test a bit on a recent podcast -- https://www.youtube.com/watch?v=ykY69lSpDdo&t=66s -- and he's clear that he's talking about a version where an expert grills the AI for as long as it takes. I.e., this question is a proxy for "will AGI happen by 2029?". I think 50% is still much too high for this market.
@dreev I see the market still thinks AGI by 2029 is likely. I think the market is wrong, but I'm not sure how much more mana I want to pour in. My meta prediction is that the market probability will keep climbing as new AI capabilities are hyped, before finally dropping as 2029 approaches and Kurzweil admits that we're still not there (as he very clearly admits currently).
@MartinRandall Yeah, or a possibly more general way to put that is that it only makes sense to bet YES on this if you think we'll get AGI that somehow doesn't make mana worthless (for better or worse).
I think AGI by 2029 is honestly quite a bit below 50% probability, but AGI by 2029 and a world where it matters that you won mana betting YES here? That's even lower probability.
@Gigacasting I just reread the rules at longbets.org/1 and I think the biggest question mark is whether Kurzweil and Kapor will agree on choosing experts as the judges. (And perhaps also whether they agree on choosing articulate, conscientious human foils.)
If so -- and reading Kurzweil's wild sci-fi reasons for expecting to win the bet, I think that would be fair -- then you've got experts grilling the AI for essentially as long as they need, and that really requires AGI for them to be fooled. Which is what the spirit of the bet was about.
It used to feel obvious to me that we were nowhere near getting computers to pass the Turing test because it was trivial to make up a single common-sense question like "what's bigger, your mom or a french fry?" and the computer would immediately fall on its face, with no hope of actually understanding what was being asked. That changed in the last year or so. Now large language models genuinely understand questions like that. (At least they're getting close to consistently answering them impeccably and I don't know how else to define "genuinely understand".)
But it's still easy to unmask the AI with a handful of follow-up questions. The leap we made recently is mind-boggling but even bigger leaps are still needed before we'll pass the Kurzweil/Kapor version of the Turing Test.
Using experts in ML/AI as judges and 145+ IQ "foils" makes it somewhat trickier, but doesn't change the fact that this does not require "AGI".
It's a parlor game for which the best thing to do is simply gather vast amounts of data on what people judge as "human-sounding"; not only could a machine win within a year (with a $1M budget for Mechanical Turk raters), but the test measures zero higher-primate abilities such as long-term planning, emotional states, etc.
Everyone will soon agree this was a dumb test, just as they are "not impressed" that GPT-3 makes Joe Biden look like a lower-IQ bird, or that DALL-E has created almost all of the best art made in 2022.
Better tests for useful AGI are Rodney Brooks' household-servant and hospital-architect tests, and a much simpler one is to beat a human at a de novo game made up on the fly. (I.e., zero-shot tasks a human can adapt to on the fly, not things you can apply supervised learning to, which the Turing test trivially is.)
Only a small fraction of people can handle basic reasoning; I’d encourage you to test made-up games with some villagers in Rwanda and see how it goes.
The test was an offhand thought experiment and a horrendous way to test nearly any abilities that matter. Speech, vision, and 100-IQ human language are beyond solved.
When it has the memory or emotion of a pet, or the reasoning and coordination to “make a cup of coffee” in an arbitrary room, that’s a good sign of human-ish intelligence.
Wordcel-ing a couple extra levels is already solved and useless.