If we had an AI in the 2020s that could exactly duplicate a strawberry, would that destroy the world? (In 2050-hindsight)

In an interview, Eliezer Yudkowsky said:

It’s difficult to align an AI on a task like: “Take this strawberry, and make me another strawberry that's identical to this strawberry.  Down to the cellular level, but not necessarily the atomic level.  So it looks the same under, like, a standard optical microscope, but maybe not a scanning electron microscope.  Do that, but don't destroy the world as a side effect.”

Now this does intrinsically take a powerful AI, so there's no way you can make it easy to align by making it stupid, to build something that's cellularly identical to a strawberry.  Mostly I think the way that you do this is with very primitive nanotechnology; we could also do it using very advanced biotechnology.  And these are not technologies that we already have, so it's got to be something smart enough to develop new technology.  Never mind all the subtleties of morality.  I think we don't have the technology to align an AI, to the point where we can say “build me a copy of the strawberry and don't destroy the world”.

Will this seem true in hindsight, in the year 2050? We can assume people then will know much more about AI alignment than we do in the 2020s.  So will they generally agree or disagree? If we had made an AI in the 2020s that could exactly duplicate a strawberry, and asked it to make one, would that have destroyed the world?

This market has ambiguities – and probably some which aren't visible until later in history.  If by the year 2050 we can't agree on directly-relevant criteria, then the question will fall back to a Keynesian Beauty Contest (KBC).  Starting the count on 2050-Jan-01, whichever of the below happens first:

  1. If the market spends 1 calendar month >=80%, then this market would resolve as “YES” representing “An AI in the 2020s that could exactly duplicate a strawberry would’ve destroyed the world”.

  2. If the market spends 1 calendar month <=20%, then this market would resolve as “NO”, representing “An AI in the 2020s that could exactly duplicate a strawberry would NOT have destroyed the world”.

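The KBC fallback above is mechanical enough to express in code. Here is a minimal sketch, assuming daily probability samples and a hypothetical helper for "one calendar month later" (none of these names come from Manifold's actual API):

```python
from datetime import date, timedelta

def one_month_later(d):
    # Same day number one calendar month later (day clamped to 28
    # to sidestep month-length edge cases; an illustrative choice).
    y, m = d.year + (d.month == 12), d.month % 12 + 1
    return date(y, m, min(d.day, 28))

def kbc_resolution(daily_prob):
    """Decide the KBC fallback resolution.

    daily_prob: list of (date, probability) pairs sorted by date,
    starting 2050-Jan-01. Returns "YES" if the market first holds
    >= 0.80 for a full calendar month, "NO" if it first holds
    <= 0.20 for a full calendar month, or None if neither has
    happened yet.
    """
    run_side = None   # "YES", "NO", or None (in the 20-80% band)
    run_start = None  # date the current qualifying run began
    for day, p in daily_prob:
        side = "YES" if p >= 0.80 else "NO" if p <= 0.20 else None
        if side != run_side:
            run_side, run_start = side, day  # run broken; restart
        if side is not None and day >= one_month_later(run_start):
            return side
    return None
```

For example, a market pinned at 90% every day from 2050-Jan-01 through 2050-Feb-01 would resolve YES under this sketch, while one that dips back into the 20–80% band resets the one-month clock.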
But if a less-ambiguous criterion is found, then this market will “upgrade” to that.  Any proposed upgrade will be discussed openly, and can be traded.  We only switch if we think that would increase the relevance of the question, and decrease the ambiguity.

Since this market design is unusual, you can see another experiment with it here.

By “destroy the world”, I mean either full extinction of humans, or destruction severe enough that humans could never rebuild (even if some were alive somewhere).


Wouldn't the AI just order some more strawberries from the grocery store and call it a day? If it HAS to duplicate them, maybe it could order some strawberry plants and tell you to water them

AFAIK KBCs on Manifold don't reliably resolve to the "correct" result. A poll-based resolution mechanism seems better if all you want is opinion of the crowd.

@horse In my brief look at that, those cases seemed to have criteria flaws that undermined them unnecessarily.

I have experimented with it a couple times, and they resolved as I'd expect. But of course I can't claim my design is immune to problems.

@EliezerYudkowsky Pinging since it relates to you.
