If Artificial General Intelligence has an okay outcome, what will be the reason?

575

22kṀ360k

2200

20%

J. Something 'just works' on the order of eg: train a predictive/imitative/generative AI on a human-generated dataset, and RLHF her to be unfailingly nice, generous to weaker entities, and determined to make the cosmos a lovely place.

10%

K. Somebody discovers a new AI paradigm that's powerful enough and matures fast enough to beat deep learning to the punch, and the new paradigm is much much more alignable than giant inscrutable matrices of floating-point numbers.

10%

Something wonderful happens that isn't well-described by any option listed. (The semantics of this option may change if other options are added.)

M. "We'll make the AI do our AI alignment homework" just works as a plan. (Eg the helping AI doesn't need to be smart enough to be deadly; the alignment proposals that most impress human judges are honest and truthful and successful.)

C. Solving prosaic alignment on the first critical try is not as difficult, nor as dangerous, nor taking as much extra time, as Yudkowsky predicts; whatever effort is put forth by the leading coalition works inside of their lead time.

O. Early applications of AI/AGI drastically increase human civilization's sanity and coordination ability; enabling humanity to solve alignment, or slow down further descent into AGI, etc. (Not in principle mutex with all other answers.)

E. Whatever strange motivations end up inside an unalignable AGI, or the internal slice through that AGI which codes its successor, they max out at a universe full of cheerful qualia-bearing life and an okay outcome for existing humans.

B. Humanity puts forth a tremendous effort, and delays AI for long enough, and puts enough desperate work into alignment, that alignment gets solved first.

I. The tech path to AGI superintelligence is naturally slow enough and gradual enough, that world-destroyingly-critical alignment problems never appear faster than previous discoveries generalize to allow safe further experimentation.

H. Many competing AGIs form an equilibrium whereby no faction is allowed to get too powerful, and humanity is part of this equilibrium and survives and gets a big chunk of cosmic pie.

D. Early powerful AGIs realize that they wouldn't be able to align their own future selves/successors if their intelligence got raised further, and work honestly with humans on solving the problem in a way acceptable to both factions.

A. Humanity successfully coordinates worldwide to prevent the creation of powerful AGIs for long enough to develop human intelligence augmentation, uploading, or some other pathway into transcending humanity's window of fragility.

N. A crash project at augmenting human intelligence via neurotech, training mentats via neurofeedback, etc, produces people who can solve alignment before it's too late, despite Earth civ not slowing AI down much.

G. It's impossible/improbable for something sufficiently smarter and more capable than modern humanity to be created, that it can just do whatever without needing humans to cooperate; nor does it successfully cheat/trick us.

An outcome is "okay" if it gets at least 20% of the maximum attainable cosmopolitan value that could've been attained by a positive Singularity (a la full Coherent Extrapolated Volition done correctly), and existing humans don't suffer death or any other awful fates.

This market is a duplicate of https://manifold.markets/IsaacKing/if-we-survive-general-artificial-in with different options. https://manifold.markets/EliezerYudkowsky/if-artificial-general-intelligence-539844cd3ba1?r=RWxpZXplcll1ZGtvd3NreQ is this same question but with user-submitted answers.

(Please note: It's a known cognitive bias that you can make people assign more probability to one bucket over another, by unpacking one bucket into lots of subcategories, but not the other bucket, and asking people to assign probabilities to everything listed. This is the disjunctive dual of the Multiple Stage Fallacy, whereby you can unpack any outcome into a big list of supposedly necessary conjuncts that you ask people to assign probabilities to, and make the final outcome seem very improbable.

So: That famed fiction writer Eliezer Yudkowsky can rationalize at least 15 different stories (options 'A' through 'O') about how things could maybe possibly turn out okay; and that the option texts don't have enough room to list out all the reasons each story is unlikely; and that you get 15 different chances to be mistaken about how plausible each story sounds; does not mean that Reality will be terribly impressed with how disjunctive the okay outcome bucket has been made to sound. Reality need not actually allocate more total probability into all the okayness disjuncts listed, from out of all the disjunctive bad ends and intervening difficulties not detailed here.)

Update 2025-07-02 (PST) (AI summary of creator comment): Regarding answer option E: This option will not be the resolution if the okay outcome is achieved through careful alignment work. Such a scenario would resolve to a different answer.

Showcase

Get

1,000

to start trading!

People are also trading

If Artificial General Intelligence has a poor outcome, what will be the reason?

Will artificial general intelligence be achieved they the end of 2025 ?

7% chance

If Artificial General Intelligence (AGI) has an okay outcome, which of these tags will make up the reason?

The probability of extremely good AGI outcomes eg. rapid human flourishing will be >24% in next AI experts survey

59% chance

If we survive general artificial intelligence before 2100, what will be the reason?

If we survive general artificial intelligence, what will be the reason?

When artificial general intelligence (AGI) exists, what will be true?

Why will "If Artificial General Intelligence has an okay outcome, what will be the reason?" resolve N/A?

Will General Artificial Intelligence happen before 2035?

70% chance

Will we have an AGI as smart as a "generally educated human" by the end of 2025?

Sort by:

"does not mean that Reality will be terribly impressed with how disjunctive the okay outcome bucket has been made to sound"

Correct. It is even more disjunctive than that; and Reality will be very unimpressed with your arguments about how likely the doom outcome should be.

@EliezerYudkowsky I feel like this is underspecified. It's entirely possible that this outcome will come about through careful alignment work, but the question seems to suggest that it will happen by accident.

@Elspeth If it comes about through careful alignment work, then some other possibility happened, not possibility E.

@EliezerYudkowsky People will be arguing which one of these outcomes has occurred after supposed AGI is invented, which will be at a time when none of the outcomes has actually occurred because what we have is not exactly AGI and is not exactly aligned either. People will think wrongly that the critical window has passed and that the threat was overblown.

@Krantz want to exit our positions here? i put up a limit right next to market price. you should probably use that money on things that resolve sooner!

I appreciate the offer, but I feel pretty confident that this could resolve soon.

I'd feel more confident with this phrasing though:

Someone discovers a new paradigm in intelligence (collective, not artificial) that's powerful enough and matures fast enough to beat deep learning to the punch, and the new paradigm is much more alignable than giant inscrutable matrices of floating-point numbers.

What I'm hoping we build isn't actually AI, so I might lose for that technicality, but I am confident that it will be the reason for an outcome we might consider 'ok' and that's worth putting money on to me.

In general, I value the survival of my family and friends above winning any of these wagers.

The primary objective for wagering in these markets, for me, is to convey information to intelligent people that have an influence over research.

i think you should sell anyway!

I try not to do things unless I have reasons to do them.

Fun fact. If my system existed today, you could add the proposition 'Krantz should sell his position in K.' and the CI would then contrast and compare each of our reasons that support or deny that proposition, map them against each other and provide me with the optimal proposition (that exists on your ledger and not mine) that would be most effective in closing the inferential distance between our cruxes.

bought Ṁ1 YES

Isn't non-epistemic betting harmful to the reputability of prediction markets as a whole?

I bought E from 1% to 8% because maybe the CEV is natural — like, maybe the CEV is roughly hedonium (where "hedonium" is natural/simple and not related to quirks about homo sapiens) and a broad class of superintelligences would prioritize roughly hedonium. Maybe reflective paperclippers actually decide that qualia matter a ton and pursuing-hedonium is convergent. (This is mostly not decision-relevant.)

This seems likely to be a poorer predicter than usual - as you can't collect if you're dead

I am trying to do this differently here : https://manifold.markets/dionisos/if-we-survive-general-artificial-in-z3suausl60

bought Ṁ50 NO

Which options would be the most popular among ML researchers who aren't concerned about AI risk? From what I've read, the top choices would be "solving alignment is easy" (C or J) or "alignment isn't even a problem" (E?).

Yes, and G too I think.

@dionisos good point, G is probably the position of the "there's no such thing as intelligence" crowd

bought Ṁ1 YES

Two questions:

Why is this market suddenly insanely erratic this past week?
Why are so many semi-plausible sounding options being repeatedly bought down to ludicrously low odds <0.5% when you'd expect almost all to be within an order of magnitude of the base rate uniform distribution across options of ~6%?

opened a Ṁ25 NO at 24% order

Some of the options like H seem... logically possible yes, but a bit out there.

As for #2 that's why I bet up E a bit. I find it the most plausible contingent on a slightly superhuman AGI happening within the next 100 years, since it's the only real "it didn't work, but nothing that bad" happened option

The probilities are currently very easy to move.

this is way too high but it's a keynesian beauty contest

I thought this market was going to be resolved by Eliezer after AGI happens?

opened a Ṁ10,000 NO at 80% order

@Krantz take my orders on that option

@jacksonpolack Why did this go to 80% lmao

@benshindel

krantz has strong beliefs about it

This is verified.

Enough humans survive to rationalize whatever the outcome is as good, actually, and regardless of cost 90%+ of society treats anyone who points to visible alignment failure as a crazy person

People are also trading

If Artificial General Intelligence has a poor outcome, what will be the reason?

Will artificial general intelligence be achieved they the end of 2025 ?

7% chance

If Artificial General Intelligence (AGI) has an okay outcome, which of these tags will make up the reason?

The probability of extremely good AGI outcomes eg. rapid human flourishing will be >24% in next AI experts survey

59% chance

If we survive general artificial intelligence before 2100, what will be the reason?

If we survive general artificial intelligence, what will be the reason?

When artificial general intelligence (AGI) exists, what will be true?

Why will "If Artificial General Intelligence has an okay outcome, what will be the reason?" resolve N/A?

Will General Artificial Intelligence happen before 2035?

70% chance

Will we have an AGI as smart as a "generally educated human" by the end of 2025?

+4% 1d38% chance

People are also trading

People are also trading

Related questions