If Artificial General Intelligence has an okay outcome, what will be the reason?
24%: Yudkowsky is trying to solve the wrong problem using the wrong methods based on a wrong model of the world derived from poor thinking and fortunately all of his mistakes have failed to cancel out
17%: AIs will not have utility functions (in the same sense that humans do not), their goals such as they are will be relatively humanlike, and they will be "computerish" and generally weakly motivated compared to humans.
11%: Alignment is not properly solved, but core human values are simple enough that partial alignment techniques can impart these robustly. Despite caring about other things, it is relatively cheap for AGI to satisfy human values.
7%: Other
4%: Someone solves agent foundations
4%: Humans become transhuman through other means before AGI happens
1%: A lot of humans participate in a slow scalable oversight-style system, which is pivotally used/solves alignment enough

Duplicate of https://manifold.markets/EliezerYudkowsky/if-artificial-general-intelligence with user-submitted answers. An outcome is "okay" if it gets at least 20% of the maximum attainable cosmopolitan value that could've been attained by a positive Singularity (a la full Coherent Extrapolated Volition done correctly), and existing humans don't suffer death or any other awful fates.

bought Ṁ10 Moral Realism is tru... YES

anyone know what's going on with the unbettable 0% options returning NaN?

It seems this market is heavily suffering from being linked when many of the options are not mutually exclusive

yeah

big benefit of all possibilities being written by the same person is there's less of that

Yudkowsky is trying to solve the wrong problem using the wrong methods based on a wrong model of the world derived from poor thinking and fortunately all of his mistakes have failed to cancel out
bought Ṁ10 Yudkowsky is trying ... NO

@CalebW In your opinion, what would be the right problem, methods, world model, and thinking? The vagueness of this option seems to turn it into a grab bag akin to "because of a reason"

Yudkowsky is trying to solve the wrong problem using the wrong methods based on a wrong model of the world derived from poor thinking and fortunately all of his mistakes have failed to cancel out

I couldn't have phrased this better myself

Yudkowsky is trying to solve the wrong problem using the wrong methods based on a wrong model of the world derived from poor thinking and fortunately all of his mistakes have failed to cancel out

Just to be clear, it's an unfortunate situation.

"Corrigibility" is a bit more mathematically straightforward than was initially presumed, in the sense that we can expect it to occur, and is relatively easy to predict, even under less-than-ideal conditions.

I just submitted this answer, and it is now at 23% and in second place. May I possibly have caused a bug to happen?

@ThothHermes No, that always happens when Other has a high probability.

@Multicore The probabilities don't match what I'd intuitively expect.

bought Ṁ1 Answer #bcff919426ce YES

@ThothHermes I just bought Ṁ1 of a ~0% answer and it jumped to 18%. This doesn't feel right given there's ~Ṁ11k in the market. Maybe the Ṁ5.5k subsidy does weird things?

bought Ṁ1 Answer #3eadfba9dbc7 YES

Ah, this was a market in the old DPM style I think. They recently ran a script to update them all to the new multiple choice format with an "other" option but their liquidity is low so they'll behave strangely.
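
To illustrate why low liquidity makes prices jumpy, here is a rough sketch. It assumes a plain binary constant-product market maker (YES pool times NO pool stays constant), which is a simplification and not necessarily the exact mechanism this site uses. The function names and pool sizes are made up for the example.

```python
def probability(yes_pool: float, no_pool: float) -> float:
    """Implied YES probability of a constant-product pool (assumed model)."""
    return no_pool / (yes_pool + no_pool)


def buy_yes(yes_pool: float, no_pool: float, amount: float):
    """Spend `amount` on YES shares.

    The bet mints `amount` YES and `amount` NO shares into the pool, then
    enough YES shares are paid out to the trader to restore the original
    product yes_pool * no_pool.
    Returns (new_yes_pool, new_no_pool, shares_received).
    """
    k = yes_pool * no_pool            # invariant before the trade
    new_no = no_pool + amount         # NO side keeps all minted NO shares
    new_yes = k / new_no              # YES side shrinks to restore k
    shares = yes_pool + amount - new_yes
    return new_yes, new_no, shares


if __name__ == "__main__":
    # Two hypothetical pools, both starting at a 1% YES probability.
    for label, yes_pool, no_pool in [("thin", 9.9, 0.1), ("deep", 990.0, 10.0)]:
        before = probability(yes_pool, no_pool)
        new_yes, new_no, _ = buy_yes(yes_pool, no_pool, 1.0)
        after = probability(new_yes, new_no)
        print(f"{label}: {before:.3f} -> {after:.3f}")
```

In this toy model the same Ṁ1 bet barely moves the deep pool but swings the thin one by tens of percentage points, which is the kind of behavior described above.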

bought Ṁ1 Answer #1ea9988fff63 YES

I suggest trading in the version of this question that Eliezer made with that new format when it came out:

But to make things less weird here I'll add a small subsidy.

Every single thing that computers have done over the past 20 years seems incredibly difficult until suddenly people look back at how simple that was to solve.

Why would people think that "human values" are any different, and that they are some extremely complex thing that is impossible to represent concisely?

@SteveSokolowski So much of this depends on timescale, and how "humanlike" you need this to be. Near-term, I'm sure technical methods can improve such representations. But the far-future's alien politics will have little regard for whatever lobby is "human values".

Especially the more distinctively-21st-century-human values. They are contextual and incoherent. Though we do have some core wants, like kinship, avoiding danger, exploration, resource gathering, etc. Things that persist because they are functional, and selected for. But those hardly cement anything humanlike into the future.

What plausible action is there to make black-hole-farmers respect our wishes? It would be like Ardipithecus stopping us from paving roads. They could fantasize we'll secretly be like them, in some deep way. In some ways, yes. But of all the life that will ever be, almost none of it has much to do with humans. And what actually drives them shouldn't be described as "human values".

I seem to be, simultaneously, way more optimistic about alignment than many EAs (in the short-term), yet also way more pessimistic about that in the long-term. I don't know why some think "our values" will have greater relevance.

Personally I'm at peace with this. Though I do wish humans only "go extinct" as a side-effect of choosing a different design for themselves. And I do forecast that the majority of outcomes will look roughly like that. But those who feel more personally invested in "human values" than me will lament this outcome, until they get that fixed.

@SteveSokolowski Should it make us live longer? Should it eliminate diseases? Which ones? Should it prevent crimes? Redistribute wealth? Weigh in on social policy or norms? I doubt satisfactory answers exist.

@AdamAlexander It is true that it's difficult for people to figure out what to program the software to do, but isn't that what humans have always done? Humans have always had different values and competing values continue to wax and wane.

The "foom" arguments people are putting forth are just unrealistic - there isn't enough power generation capacity in the world to do that. In the meantime, we'll see a slow buildup that looks sort of like things do now, as electricity shortages limit the ability of any one person to impose his or her values on the world.

@SteveSokolowski I don't think making an AI capable of acting on any of the difficult questions I listed would require prohibitively much electricity. Certainly very little in comparison to how much people might like it to take one side or another and act. The prospect that within a decade or two, an AI could use less electricity than a hospital and effect much more life-lengthening seems extremely likely to me, and it's certainly not the limit case of difficult bioethical questions people want to take action on. Taking the disruption to essays in schools as an example, I expect many disruptive and ethically thorny decisions to be made with little forethought, and I expect the magnitude of potential consequences to increase dramatically.

@ScroogeMcDuck

> Personally I'm at peace with this. Though I do wish humans only "go extinct" as a side-effect of choosing a different design for themselves. And I do forecast that the majority of outcomes will look roughly like that. But those who feel more personally invested in "human values" than me will lament this outcome, until they get that fixed.

I think it is acceptable that humans go extinct, or biological life, or even individuality or life altogether. But I think it is strange to not care about human values.

From my point of view, we should avoid hell-like futures at all costs, and this is certainly a human value.

And I don't feel like the current probability of ending up in one is small enough.

After that, I think it would still be a waste to lose some other things for eternity, just because we were impatient and unwise, and we didn’t want to wait even a hundred years (which is mostly nothing on these scales), to get a better grasp on security and what we were doing and wanted to do.

@dionisos Sure, some things are worth fighting for. But lots of our values don't seem like avoiding the unambiguously hell-like futures. I won't try elaborating on them here.

Probably lots of sacred things won't survive, and were specific to our time/place. Though I don't expect most people to stop feeling anxious about it. I'm sure their assertiveness is even adaptive, at some dose. But to me it's a bit like people in year 1100 trying to "advise" us today. I could cherry-pick some things to agree with Year 1100 people on. But I don't really expect them to have good advice for us. That's similar to how I feel about big interventions we might try on the Year Million culture.

/Shrug

...And with that, I wish you a Happy New Year!

@ScroogeMcDuck Happy New Year :-) !

@LordWilmgaddark Why is this at 19%, and 2nd place behind the meme answer? Just because the market isn't serious in general? Or are people legitimately thinking an AI smart enough to conceive of and attempt a takeover, smarter than any individual human, would be dumb enough to try without a ridiculously massive clear advantage?

They'll know and understand things like comparative advantage, tail-end risk management, game theory, etc., far better and more easily than people because they won't have the human-specific cognitive biases that make many important modern considerations like these unintuitive.

@DavidHiggs

https://www.lesswrong.com/posts/B5CNPqYL7XcHzgzHc/a-weak-agi-may-attempt-an-unlikely-to-succeed-takeover

Not sure if I agree.

Bear in mind that this market is conditional on a miracle so all answers should seem miraculous.

@DavidHiggs I mean, I didn't actually think any of my answers were serious, but it doesn't seem impossible for something along the lines of a failed takeover to happen. Something less intelligent than a human, and having been exposed to a lot of things that mention harming humans, might go out and harm a bunch of humans, even if it doesn't have a great shot at success.
I kind of expect there to be some "warning shots" before the end that AI can be dangerous, although I don't know if it'll actually take the form of a takeover per se - it could just be things like AI-engineered viruses, or ramped up misinformation that makes it even more difficult to trust anything at all on the Internet, or even just humans using AI assistance in their own harmful-to-other-people schemes.
The part of my answer that felt rather unlikely to me, though, was the part where everyone wakes the hell up and starts doing something about it. On my mainline, some misuse of AI causes a few minor catastrophes every couple years, and people either shrug and say that the benefits outweigh the harms (and maybe they even do, in the short term), or take some action that looks like it's restricting AI development but isn't actually nearly enough to address the actual problems. And then they forget all about it.

Unless, like, it's actually something on the caliber of some weak AI getting access to nuclear weapons and using them to kill a billion people (but not the other seven billion) - that would wake people up, I think. But I doubt anything that serious would happen; the gap between being able to kill more than a couple thousand people and being able to kill everyone is a narrow range of capability.

@LordWilmgaddark There's also the question of whether a nuclear exchange that kills billions prevents us from achieving 20%+ of maximum score.