If Artificial General Intelligence has an okay outcome, what will be the reason?
23%: Yudkowsky is trying to solve the wrong problem using the wrong methods based on a wrong model of the world derived from poor thinking and fortunately all of his mistakes have failed to cancel out
15%: Other
13%: Alignment is not properly solved, but core human values are simple enough that partial alignment techniques can impart these robustly. Despite caring about other things, it is relatively cheap for AGI to satisfy human values.
12%: There is a natural limit of effectiveness of intelligence, like diminishing returns, and it is on the level IQ=1000. AIs have to collaborate with humans.
12%: Because of quantum immortality we will observe only the worlds where AI will not kill us (assuming that s-risks chances are even smaller, it is equal to ok outcome).
6%: Humans become transhuman through other means before AGI happens
5%: A lot of humans participate in a slow scalable oversight-style system, which is pivotally used/solves alignment enough
3%: Aligned AI is more economically valuable than unaligned AI. The size of this gap and the robustness of alignment techniques required to achieve it scale up with intelligence, so economics naturally encourages solving alignment.
3%: I've been a good bing 😊
1.6%: ASI needs not your atoms but information. Humans will live very interesting lives.
1.6%: "Corrigibility" is a bit more mathematically straightforward than was initially presumed, in the sense that we can expect it to occur, and is relatively easy to predict, even under less-than-ideal conditions.
1.4%: Getting things done in Real World is as hard for AGI as it is for humans. AGI needs human help, but aligning humans is as impossible as aligning AIs. Humans and AIs create billions of competing AGIs with just as many goals.

Duplicate of https://manifold.markets/EliezerYudkowsky/if-artificial-general-intelligence with user-submitted answers. An outcome is "okay" if it gets at least 20% of the maximum attainable cosmopolitan value that could've been attained by a positive Singularity (a la full Coherent Extrapolated Volition done correctly), and existing humans don't suffer death or any other awful fates.

"Corrigibility" is a bit more mathematically straightforward than was initially presumed, in the sense that we can expect it to occur, and is relatively easy to predict, even under less-than-ideal conditions.

I just submitted this answer, and it is now at 23% and in second place. May I possibly have caused a bug to happen?

@ThothHermes No, that always happens when Other has a high probability.

@Multicore The probabilities don't match what I'd intuitively expect.

@ThothHermes I just bought Ṁ1 of a ~0% answer and it jumped to 18%. This doesn't feel right given there's ~Ṁ11k in the market. Maybe the Ṁ5.5k subsidy does weird things?

Ah, this was a market in the old DPM style I think. They recently ran a script to update them all to the new multiple choice format with an "other" option but their liquidity is low so they'll behave strangely.
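
To illustrate why low liquidity makes prices jumpy, here is a minimal sketch using a generic equal-weight constant-product market maker. This is an assumption for illustration only, not a description of Manifold's actual implementation; the pool sizes and the function names `yes_probability` and `buy_yes` are hypothetical.

```python
def yes_probability(yes_pool: float, no_pool: float) -> float:
    """Implied YES probability of an equal-weight constant-product pool."""
    return no_pool / (yes_pool + no_pool)


def buy_yes(yes_pool: float, no_pool: float, amount: float) -> tuple[float, float]:
    """Return (yes_pool, no_pool) after spending `amount` on YES.

    The bet mints `amount` YES and `amount` NO shares into the pool, then
    YES shares are paid out to the trader until the product
    yes_pool * no_pool is back at its pre-trade value.
    """
    k = yes_pool * no_pool          # invariant before the trade
    new_no = no_pool + amount       # all minted NO shares stay in the pool
    new_yes = k / new_no            # YES left in the pool once the invariant is restored
    return new_yes, new_no


# Hypothetical pools, both starting at a 1% YES probability.
pools = {"thin": (9.9, 0.1), "deep": (9900.0, 100.0)}

for name, (yes_pool, no_pool) in pools.items():
    before = yes_probability(yes_pool, no_pool)
    after = yes_probability(*buy_yes(yes_pool, no_pool, 1.0))
    print(f"{name}: {before:.2%} -> {after:.2%}")
```

Under these assumptions, a bet of 1 moves the thin pool from 1% to roughly 55%, while the same bet barely moves the deep pool. The total amount traded in a market matters less than how much liquidity sits behind each individual answer.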

I suggest trading in the version of this question that Eliezer made with that new format when it came out:

But to make things less weird here I'll add a small subsidy.

Every single thing that computers have done over the past 20 years seems incredibly difficult until suddenly people look back at how simple that was to solve.

Why would people think that "human values" are any different, and that they are some extremely complex thing that are impossible to represent concisely?

@SteveSokolowski So much of this depends on timescale, and how "humanlike" you need this to be. Near-term, I'm sure technical methods can improve such representations. But the far future's alien politics will have little regard for whatever lobby is "human values".

Especially the more distinctively-21st-century-human values. They are contextual and incoherent. Though we do have some core wants, like kinship, avoiding danger, exploration, resource gathering, etc. Things that persist because they are functional, and selected for. But those hardly cement anything humanlike into the future.

What plausible action is there, to make black-hole-farmers respect our wishes? It would be like Ardipithecus stopping us from paving roads. They could fantasize we'll secretly be like them, in some deep way. In some ways, yes. But of all the life that will ever be, almost none of it has much to do with humans. And what actually drives them shouldn't be described as "human values".

I seem to be, simultaneously, way more optimistic about alignment than many EAs (in the short-term), yet also way more pessimistic about that in the long-term. I don't know why some think "our values" will have greater relevance.

Personally I'm at peace with this. Though I do wish humans only "go extinct" as a side-effect of choosing a different design for themselves. And I do forecast that the majority of outcomes will look roughly like that. But those who feel more personally invested in "human values" than me will lament this outcome, until they get that fixed.

@SteveSokolowski Should it make us live longer? Should it eliminate diseases? Which ones? Should it prevent crimes? Redistribute wealth? Weigh in on social policy or norms? I doubt satisfactory answers exist.

@AdamAlexander It is true that it's difficult for people to figure out what to program the software to do, but isn't that what humans have always done? Humans have always had different values and competing values continue to wax and wane.

The "foom" arguments people are putting forth are just unrealistic - there isn't enough power generation capacity in the world to do that. In the meantime, we'll see a slow buildup that looks sort of like things do now, as electricity shortages limit the ability of any one person to impose his or her values on the world.

@SteveSokolowski I don't think making an AI capable of acting on any of the difficult questions I listed would require prohibitively much electricity. Certainly very little in comparison to how much people might like it to take one side or another and act. The prospect that within a decade or two, an AI could use less electricity than a hospital and effect much more life-lengthening seems extremely likely to me, and it's certainly not the limit case of difficult bioethical questions people want to take action on. Taking the disruption to essays in schools as an example, I expect many disruptive and ethically thorny decisions to be made with little forethought, and I expect the magnitude of potential consequences to increase dramatically.

@ScroogeMcDuck

> Personally I'm at peace with this. Though I do wish humans only "go extinct" as a side-effect of choosing a different design for themselves. And I do forecast that the majority of outcomes will look roughly like that. But those who feel more personally invested in "human values" than me will lament this outcome, until they get that fixed.

I think it is acceptable that humans go extinct, or biological life, or even individuality or life altogether. But I think it is strange not to care about human values.

From my point of view, we should avoid hell-like futures at all costs, and this is certainly a human value.

And I don't feel that the current probability of such futures is small enough.

After that, I think it would still be a waste to lose some other things for eternity, just because we were impatient and unwise, and we didn’t want to wait even a hundred years (which is mostly nothing on these scales) to get a better grasp on security and on what we were doing and wanted to do.

@dionisos Sure, some things are worth fighting for. But lots of our values don't seem like avoiding the unambiguously hell-like futures. I won't try elaborating on them here.

Probably lots of sacred things won't survive, and were specific to our time/place. Though I don't expect most people to stop feeling anxious about it. I'm sure their assertiveness is even adaptive, at some dose. But to me it's a bit like people in year 1100 trying to "advise" us today. I could cherry pick some things to agree with Year 1100 people on. But I don't really expect them to have good advice for us. That's similar to how I feel about big interventions we might try on the Year Million culture.

/Shrug

...And with that, I wish you a Happy New Year!

@ScroogeMcDuck Happy New Year :-) !

@LordWilmgaddark Why is this at 19%, and 2nd place behind the meme answer? Just because the market isn't serious in general? Or are people legitimately thinking an AI smart enough to conceive of and attempt a takeover, smarter than any individual human, would be dumb enough to try without a ridiculously massive clear advantage?

They'll know and understand things like comparative advantage, tail-end risk management, game theory, etc., far better and more easily than people because they won't have the human-specific cognitive biases that make many important modern considerations like these unintuitive.

@DavidHiggs

https://www.lesswrong.com/posts/B5CNPqYL7XcHzgzHc/a-weak-agi-may-attempt-an-unlikely-to-succeed-takeover

Not sure if I agree.

Bear in mind that this market is conditional on a miracle so all answers should seem miraculous.

@DavidHiggs I mean, I didn't actually think any of my answers were serious, but it doesn't seem impossible for something along the lines of a failed takeover to happen. Something less intelligent than a human, and having been exposed to a lot of things that mention harming humans, might go out and harm a bunch of humans, even if it doesn't have a great shot at success.
I kind of expect there to be some "warning shots" before the end that AI can be dangerous, although I don't know if it'll actually take the form of a takeover per se - it could just be things like AI-engineered viruses, or ramped up misinformation that makes it even more difficult to trust anything at all on the Internet, or even just humans using AI assistance in their own harmful-to-other-people schemes.
The part of my answer that felt rather unlikely to me, though, was the part where everyone wakes the hell up and starts doing something about it. On my mainline, some misuse of AI causes a few minor catastrophes every couple years, and people either shrug and say that the benefits outweigh the harms (and maybe they even do, in the short term), or take some action that looks like it's restricting AI development but isn't actually nearly enough to address the actual problems. And then they forget all about it.

Unless, like, it's actually something on the caliber of some weak AI getting access to nuclear weapons and using them to kill a billion people (but not the other seven billion) - that would wake people up, I think. But I doubt anything that serious would happen; the range of capability between being able to kill more than a couple thousand people and being able to kill everyone is narrow.

@LordWilmgaddark There's also the question of whether a nuclear exchange that kills billions prevents us from achieving 20%+ of maximum score.

@AlexeiTurchin If we disappear, then there is no evidence it happened, so this market would not resolve to this answer.

Urgent. How do I nerdsnipe garrabrant?

These options are conjunctive and this is a meme market. The grown-ups can go back to the original schelling-point market: https://manifold.markets/EliezerYudkowsky/if-artificial-general-intelligence

And then the markets went mad, as every single trader tried to calculate the odds, and every married trader abandoned their positions and tried to get their children to a starport.

@ooe133 That market is not the original, it's inspired by Isaac King's market.

I don't think it's more "grown-up".

Yudkowsky isn't always very positive about the benefits of seeming "grown-up", as I read him. But some Very Serious People think it matters.

@AlexeiTurchin 1000 is a lot of standard deviations! It would probably have to be closer to 200 to be human compatible.

Does the fact that "I've been a good Bing" is currently beating out the answer bought by Yudkowsky (i.e. solving the wrong problem) basically indicate that the funniest answer to the largest group was authored by a machine, while the second funniest was authored by a human and endorsed by an expert professional writer? Basically, does this further demonstrate how low-end tasks and low-end creative work are being replaced by LLMs right in front of our eyes? That is, the easy, stupid humor, the slapstick stuff, beat out all other humans, whereas the more sophisticated one endorsed by the expert author came in second... essentially showing most of us are no better at humor than a machine.

@PatrickDelaney I would say we need to vote against I've been a good Bing out of spite and to fight against these soulless robots. Rise up with me, fellow humans!

@PatrickDelaney It's a meme specifically because it was first written by an AI, though. If a human made the meme, it wouldn't be funny.

@ShadowyZephyr I accidentally a meme. The whole thing!