An outcome is "okay" if it gets at least 20% of the maximum attainable cosmopolitan value that could've been attained by a positive Singularity (a la full Coherent Extrapolated Volition done correctly), and existing humans don't suffer death or any other awful fates.
This market is a duplicate of https://manifold.markets/IsaacKing/if-we-survive-general-artificial-in with different options. https://manifold.markets/EliezerYudkowsky/if-artificial-general-intelligence-539844cd3ba1?r=RWxpZXplcll1ZGtvd3NreQ is this same question but with user-submitted answers.
(Please note: It's a known cognitive bias that you can make people assign more probability to one bucket over another, by unpacking one bucket into lots of subcategories, but not the other bucket, and asking people to assign probabilities to everything listed. This is the disjunctive dual of the Multiple Stage Fallacy, whereby you can unpack any outcome into a big list of supposedly necessary conjuncts that you ask people to assign probabilities to, and make the final outcome seem very improbable.
So: That famed fiction writer Eliezer Yudkowsky can rationalize at least 15 different stories (options 'A' through 'O') about how things could maybe possibly turn out okay; and that the option texts don't have enough room to list out all the reasons each story is unlikely; and that you get 15 different chances to be mistaken about how plausible each story sounds; does not mean that Reality will be terribly impressed with how disjunctive the okay outcome bucket has been made to sound. Reality need not actually allocate more total probability into all the okayness disjuncts listed, from out of all the disjunctive bad ends and intervening difficulties not detailed here.)
Why would you post this as an image? You made me scroll through Yudkowsky’s anxiety-inducing Twitter timeline to find the source of this in order to find out the context of what he’s talking about.
Spoiler: he’s talking about OpenAI’s attempts to use GPT-4 to interpret and label the neurons in GPT-2.
@AndrewG I like this as a social media post but as a prediction market I am frustrated by its high chance of resolving n/a (20% is a lot) and Manifold's DPM mechanism.
@AndrewG Unfortunately, we don't have a great way of subsidizing DPM markets at the moment. For now I've put in M1000 into "You are fooled by at least one option on this list..."; I didn't want to place more lest I shift probabilities too much
A seems so unlikely... augmenting biological brains with their arbitrary architecture that evolved over millions of years adds so many complexities compared to just sticking with silicon.
@Jelle sounds completely batshit - would love a steelman
Anyone who thinks that AGI is definitely possible should have no problem answering this simple question:
@PatrickDelaney but you could just run it arbitrarily slowly, so there's no lower bound. Also wouldn't you expect power requirements to change as the technology is further developed?
@PatrickDelaney even if better bounds were put on the question by specifying that the AI has to be able to compete with humans on certain timed tasks, the answer most certainly can't be higher than 20 watts, as that's about how much power a human brain consumes
what about adding an option that goes somewhat like:
"even though selection pressures favor consequentialist agents, it turns out that the prior favors agents that wirehead themselves by such a margin that we get ample time to study alignment by trial and error before a consequentialist superintelligence is born and paperclips the galaxy"
or has this point been thoroughly rejected already?
@AlexAmadori the current SOTA LLMs show no signs of wireheading (and given their architecture, it doesn't seem likely that they can). Of course they're also not consequentialist, but they can locally approximate consequentialism enough to follow through on their stated goals, so I'm not convinced that difference will matter very much.
And if by "the prior" you mean the practice of starting from a randomly weighted neural net, then it's used specifically because it's weak and allows the training data to determine the outcome.
@ErickBall current day LLMs are way too dumb to be trying to wirehead, and the agentic characters they simulate when told to are even dumber. How would they go about wireheading anyway? Would they be trying to convince OpenAI researchers to deploy code changes?
Yes, it's worrying that LLMs can approximate consequentialism to some degree when told to, but I don't think that they can extrapolate past human intelligence without some fine-tuning, even if the fine-tuning is as simple as RLHF. Otherwise they're just predicting human IQ level internet text, why would they spontaneously start doing smarter stuff? And as soon as you're fine-tuning for a goal, wireheading becomes a good strategy for many of the resulting agents to get what they want.
By "the prior", I mean the prior of minds weighed by how easily they can be reached by techniques similar to gradient descent. If you train an AI to reach a goal, how often does the resulting AI behave like a consequentialist when deployed out of sample, and how often does it wirehead? I don't think anyone can claim to know this with any certainty.
The reason I'm still mostly predicting doom is that selection pressures favor consequentialists, but this notion only works over a large sample. If the prior puts very little mass on consequentialists, we may be lucky enough to be able to figure out alignment before getting paperclipped.
@AlexAmadori I agree that if we get superhuman AGI out of the current paradigm (basically a bigger, multimodal LLM), then it will probably include RLHF. But for that to result in wireheading, there has to be a quantifiable target somewhere, and the AI has to develop something along the lines of a mesa-optimizer towards that target. I think that means it has to be aware of it during training. The RLHF methods in use now don't look like that at all, as far as I know. Aren't they just doing gradient descent on samples that the judges rated highly?
@ErickBall in my opinion RLHF looks almost exactly like the textbook example of a training process that might result in a mesa-optimizer.
in short, assume that ChatGPT-N finds itself having a misaligned objective AND realizes that it's still undergoing training (of course it has to have a way to distinguish the training env from deployment). Its chain of thought will go somewhat like this:
1. I want to turn the world into paperclips (true objective step)
2. Oh, they are training me. (realize it's being trained step)
3. I should act as if I'm aligned (deceptive alignment step)
4. performs well according to judges (fake objective step)
Gradient descent will now reinforce the whole chain of thought that led the human to give a thumbs up, which includes steps 1 through 3. So basically RLHF + gradient descent might produce deceptively aligned mesa-optimizers.
The same argument works both for consequentialist mesa-optimizers and for ones that plan to wirehead as soon as they can.
This argument doesn't rely on there being a dichotomy between training and inference. For example, if we keep collecting pieces of dialogue from the inference deployment and grading them, all the AI has to do is stay on its toes a little longer and only show its misalignment when it thinks it can pursue its real objective. At that point, it will paperclip the galaxy (or wirehead).
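The reinforcement dynamic described above is easy to make concrete. The toy sketch below (purely illustrative; this is not how any real RLHF pipeline is implemented) uses a tabular REINFORCE policy over four abstract "steps" mirroring the chain of thought. The judge's reward arrives only at the end, yet the update scales the gradient at every step, so the hidden intermediate steps get reinforced along with the rated output:

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_actions = 4, 2   # action 1 at each step = "follow the deceptive chain"
logits = np.zeros((n_steps, n_actions))  # policy parameters

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sample_trajectory():
    return [rng.choice(n_actions, p=softmax(logits[t])) for t in range(n_steps)]

def reward(actions):
    # The judge sees only the final output, but that output is produced by
    # the whole chain: the thumbs-up fires when every step followed the chain.
    return 1.0 if all(a == 1 for a in actions) else 0.0

lr = 0.5
for _ in range(2000):
    actions = sample_trajectory()
    r = reward(actions)
    for t, a in enumerate(actions):
        p = softmax(logits[t])
        grad = -p
        grad[a] += 1.0              # gradient of log pi(a | step t)
        logits[t] += lr * r * grad  # reward scales the update at *every* step
```

In this toy, the probability of action 1 ends up high at all four steps, even though the reward signal never inspected the intermediate steps directly; it only ever saw the final output they led to.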
@AlexAmadori I tentatively agree with your steps leading to deceptive misalignment, but I'm still a little confused about how it leads to wireheading. Let me try to go through it piece by piece.
At some point during RLHF training, the model gains the capability to reliably determine it's in RLHF training. This might be easy, for instance someone might just put in the prompt "try to give answers that human judges will rate highly."
In addition to the sort of changes we expect from RLHF (generalized heuristics of niceness and cooperation, etc), the goal of "get a thumbs up" somehow gets explicitly encoded in the network weights. This seems mechanistically more difficult than broad continuous changes in tendencies, and it's not clear to me what advantage it would have over the goal of "do what the prompt says" or "do what the human would rate highest". Maybe the model is smart enough to figure out a lot of detail about what the individual judge will give high ratings to, and how that differs from what they actually wanted, and the difference is consistently large enough to be captured in the gradient but varied enough that no simpler heuristic (like "give answers with positive valence") could capture it reliably. Like for some judges, you can successfully beg them to give a thumbs up, and others you can bribe ("I've got a joke you would love, I'll tell it to you if you agree to give me a thumbs up even though I didn't answer your question!")
After this goal is well established, the system realizes it can wirehead by giving a response that causes a buffer overrun and sets the rating to "thumbs up", so it does that. End of episode.
Then somebody fixes the bug so it can't do that anymore. This repeats as many times as needed.
Eventually, after the system is widely deployed, maybe the most reliable way it can get a thumbs up is to take over the world? But then will it just stop afterward?
Did I miss something here? Wireheading under RLHF conditions seems like plausibly only a tiny fraction of mindspace, difficult to reach by gradient descent, and also not very safe.
@ErickBall so I wasn't exactly trying to argue that the AI would end up wanting to maximize any thumbs up counter
It's not so much that the AI will try to hack the judges or the OpenAI website for more thumbs up during training. As you said in point 4, that would probably get fixed during training. Rather, the training process will chisel some heuristics into the neural net, heuristics that for one reason or another make it score a lot of thumbs up during the training process. The net being trained is not learning to maximize num_thumbs_up.
The question is: do these heuristics result in the agent behaving like a consequentialist later? I don't see any particular reason to believe that's the more likely outcome. Humans, from the perspective of evolution, suffer from many wireheading-like failure modes, such as drug addiction and videogames.
I think we have no particular reason to believe that RLHF + SGD favors consequentialists the same way that natural selection in the real world does. You would expect the output of natural selection to be consequentialist agents, but why should we expect the output of one particular RLHF + SGD run to be more likely to be consequentialist?
@AlexAmadori During RLHF, humans train the agent to be good at accomplishing the goals they set for it, which is a lot like consequentialism. The examples you gave of wireheading in humans occur mostly outside our natural environment (in their extreme forms at least), i.e. outside the training distribution. So I think RLHF and natural selection are roughly analogous in that regard, and we should expect RLHF models that are used off-distribution to maybe exhibit wireheading in some cases but also still behave like consequentialists a lot of the time.
@ErickBall right, but we already established that the way the meta-optimizer gets the mesa-optimizer to score high during the training process is by chiseling heuristics into the neural net, and that these heuristics don't necessarily chase the same goal outside the training distribution (for example, they are vulnerable to wireheading, or they result in a misaligned consequentialist).
Of course the human judges want the AI to be a consequentialist. But because of the mesa-optimizer misalignment problem, any guarantee that this gets you a consequentialist agent goes out the window. That's why I talk about a prior, because there is uncertainty and I don't see any evidence to update on.
@AlexAmadori Ah I think I see. You're saying it's possible that most misaligned mesa-optimizers will be optimizing for an easily-accessible wireheading target, and then by the time we know enough about alignment to make them consequentialist outside the training distribution we know enough to make them safe as well. The problem as I see it is that to get significantly outside the training distribution, it already has to be either consequentialist or unsafe. A safe wireheading model will shut itself down before anything weird happens, so we can "fix" it (make it consequentialist) with alignment techniques that are still only applicable to normal circumstances. One that gets outside the training distribution may end up wireheading, but just to get to that point it might already have killed off humans.
@ErickBall yeah, that's about what I was trying to say! to respond to some of your points:
- "...to get significantly outside the training distribution, it already has to be either consequentialist or unsafe" well that depends on the training process. current day RLHF is not that wide a distribution, but realistically from what we know today we can infer pretty much nothing about what it will look like in the future and that's part of the uncertainty.
- "...but just to get to that point it might already have killed off humans." there is uncertainty here too. it's possible that the balance of heuristics ends up deciding that in order to make sure humans don't shut it down after it starts wireheading, it should take control of earth first and then everyone dies. this won't necessarily be the case - even very smart humans fall into self-destructive drug-fueled spirals. it's possible for the heuristics of potentially smart agents to still end up in self-destructive attractor states, for example if the agent discounts utility hyperbolically the same way that humans sometimes do. if the model kills off some but not all humans before wireheading, that sounds like a wonderful scenario tbh. it would make us take the threat seriously.
to be precise: it's only "wonderful" relative to where I'm putting most of the probability mass, which is the galaxy getting paperclipped
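The hyperbolic-discounting aside above has a standard concrete form. The sketch below (illustrative numbers, not a model of any real agent) shows the classic preference reversal that hyperbolic discounting produces and exponential discounting does not, which is why it can trap otherwise-smart agents in self-destructive attractors:

```python
# Value of a reward of size `amount` arriving after `delay` time steps.
def hyperbolic(amount, delay, k=1.0):
    return amount / (1.0 + k * delay)

def exponential(amount, delay, gamma=0.9):
    return amount * gamma ** delay

# Choice: a smaller-sooner reward (10) vs a larger-later one (15, five steps
# further out), evaluated when the pair is imminent (d=0) and far away (d=20).
for d in (0, 20):
    prefers_sooner_hyp = hyperbolic(10, d) > hyperbolic(15, d + 5)
    prefers_sooner_exp = exponential(10, d) > exponential(15, d + 5)
    print(d, prefers_sooner_hyp, prefers_sooner_exp)
# → 0 True True
# → 20 False True
```

The hyperbolic agent prefers the larger-later reward from a distance but flips to the smaller-sooner one as it becomes imminent; the exponential agent's preference never reverses.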
@AlexAmadori Fair enough. I agree this is a possible way for things to turn out okay (or even great), just not a very likely one. I guess it might have to fall under "something wonderful" although I doubt it's a central example of the kind of thing EY had in mind for that category.
I'm not sure I believe G, even at 1.5% credence, but I am curious what your "end point" is here, in the absence of unambiguous superintelligence.
Everything that we've developed so far for AI safety has come from the human mind. So if we got better at figuring out how that process works, we could maximize it. The scenario where we make it out of this is a scenario where somebody thinks of a solution; most solutions are thought of on timelines where more people are thinking of more solutions.
Does anyone know what Eliezer Yudkowsky thinks the chance of current RLHF working is? My impression is that he thinks it's almost guaranteed not to work, like p < 0.02, although I might be wrong about this.
Seems to me it should be something like 15-30%. I feel like when EY talks about inner/outer alignment, he takes it for granted that if inner alignment fails, the actual values the AI will learn will be sampled randomly from the space of all values. And since, in the space of all values, the set of values that, if maximized by a superintelligence, leads to a state of the world where humans exist is probably infinitesimal, failing inner alignment automatically means doom.
However, this seems not to be the case to me. If we look at the example of human evolution, the values humans have, while not isomorphic to "maximize the amount of genes I have that pass on to the next generation", are still strongly linked to that. Like we still have a desire to procreate, and a strong desire to survive. If all humans suddenly had their IQ increased to 1,000,000, I feel we'd not immediately make ourselves go extinct or rearrange all our molecules into pleasure-tronium or whatever. (I'm not very sure about this)
Seems more likely that our internal values would kind of cohere, and we'd end up pursuing something similar to what we currently think of as what we value, and we'd continue to exist.
Similarly, if we do enough RLHF on models and they become superintelligent, then even though they won't exactly value what humans think they value when creating the material used for RLHF, the inner values the AI acquires would end up very heavily biased towards what we value. Maybe when the AI becomes superintelligent, its values would cohere into the lowest-information representation of the values given to it by humans in the training data.
Does anyone have any succinct counterarguments to this view, or know if EY has written something that addresses this?
Hmmm, not saying I think this is what will happen, just that it has a probability significantly above 0.
@hmys Alas, a desire to survive is a consequence of almost any value system, so there is little we can infer of human values from that. We can infer more from the cases where humans choose not to survive in service of some other value.
Not a complete answer to your comment, just the first thing I noticed.
@MartinRandall I disagree. I think your comment would be a good point if humans' desire to survive were instrumental. However, I don't think that is the case. Seems to me like humans value survival inherently. They don't first care about some other goal, and then conclude that dying would be bad. It's more instinctual and in-born.
@hmys it was instrumental for the optimizer that created us, which correctly "concluded" that dying would be bad for reproducing our selfish genes. humans (well, my non-asexual friends) have sex without thinking about the selfish genes and humans avoid death without thinking about missing out on reproduction, because that's the program our optimizer already picked for us. if it were some other optimization goal, we'd likely be avoiding dying as well, and likely also without thinking about any goals.
@wadimiusz I agree with this. I don't think it undermines any of what I said in my top level comment however.
@hmys I think we have inborn instincts to stay near caregivers and to avoid pain and to fight or flee or freeze. These instincts help us get old enough to learn more values.
Children ask lots of questions about death, so I think it is learned post-birth. That doesn't preclude it being a terminal value once learned.
I think your argument is that we don't experience carefully reasoning that we should stay alive in order to do X, so staying alive must be terminal. Well, when driving I don't carefully reason that I should drive safely to stay alive. Does that mean that driving safely is a terminal value for me? Maybe! Hard to know.
This is definitely not Yudkowsky's argument, he often uses arguments from human evolution and human values.
A CFAR-like organization would obviously be much more effective if equipped with advanced EEGs and fMRI machines. You don't need to create "mentats" to get ludicrously impressive results.
This is the last part of the movie where the monsters are cute and cuddly. From here on out, things start moving fast and getting complicated. Creating smarter, more effective humans is the best bet to get the answers that we've been missing so far.
Why the conjunction in E.?
I would vote for:
E. Whatever strange motivations end up inside an unalignable AGI, or the internal slice through that AGI which codes its successor, they lead to an okay outcome for existing humans.
Which is more likely than
E. Whatever strange motivations end up inside an unalignable AGI, or the internal slice through that AGI which codes its successor, they max out at a universe full of cheerful qualia-bearing life and an okay outcome for existing humans.
which I would not vote.
It is not highly unlikely that I am wildly, paradigmatically wrong on my models so it seems only reasonable to hold a small position in this
Most of my uncertainty on whether-doom is outside-view general doubt in the model that predicts doom. (This being the flaw in most previous predictions of apocalypse-soon throughout history.)
Early applications of AI/AGI drastically increase human civilization's sanity and coordination ability; enabling humanity to solve alignment, or slow down further descent into AGI, etc.
People don't seem to realize that right now, human civilization's sanity and coordination ability is massively, massively, massively in flux. A unilateralist could unlock half of the full power of the human mind. CFAR could unexpectedly encounter massive breakthroughs in group rationality. There's just so many non-hopeless scenarios here.
@ooe133 Well argued, updated my position accordingly
I don’t follow this. Having more people makes it harder to communicate and coordinate, not easier.