Will more than 1000 deaths be caused by misaligned AI before 2028?
18% chance

If AI is used as a weapon system, it doesn't count because there's no misalignment.


My guess is that AI might be used to design "weapons" against the AI creators'/distributors' wishes, which would be misaligned, IMO. But I think it probably wouldn't resolve YES, based on the description. The question here seems tough to define.

If it's used as a weapons system and it kills people it wasn't supposed to, then there's misalignment.

bought Ṁ50 of NO

@Tripping Wouldn't that depend on WHY it killed the unintended people? Would it be misalignment if the AI was just bad at distinguishing civilian and military targets? You wouldn't call it misalignment if a guidance system on a cruise missile malfunctioned and it hit someone's house.

@akrasiac I would, because what it did was misaligned with our intentions

predicts NO

@Tripping I don't think that's a useful definition of misalignment. It seems like you're saying AI misalignment is synonymous with "AI making a mistake," which seems overly broad.

@akrasiac If an AI making a mistake kills 1000 people, then 1000 people are dead. If we didn't want it to kill 1000 people, if we were trying to align it to our intentions, which didn't involve it killing 1000 people, and then it kills 1000 people anyway, then we did not succeed at aligning the AI to what we, the humans, wanted.

I think the way you're defining that situation allows you to say that an AI acting in ways that are contrary to how we wanted it to act isn't misalignment. I don't see what you would include as misalignment in that case.

@Tripping if an accidental explosion of a nuclear bomb happens it not in pair with an accident like Fukushima or Chernobyl.

@FranklinBaldo "it not in pair"?

@Tripping I mean the use of nuclear energy for warfare is despicable, and thus if some accident happens it cannot be compared with the peaceful use of nuclear energy.

Do we care whether an AI kills us all "peacefully" or "despicably"? Isn't the important test whether or not we want it to be killing us, whether or not it's aligned with our intentions?

@akrasiac

Like, what even is misalignment if not that? Do you have another coherent definition, or even an example of what gets to count under your alternative definition? How would anyone be able to tell the difference between a situation where there was misalignment and one where there wasn't?

The normal definition is clear - it has a clear test for whether there's misalignment or not - namely, does the AI do what we want it to do, or did it do what we wanted it to do? Or in other words, is the AI aligned with us and what we want?

If you're not using that test, how are you coming up with ways to distinguish between misalignments that count and misalignments that don't?

@Tripping Yes, we care. It is much easier to convince the public that the military use of AI is bad than that the civilian use of AI is bad.

@FranklinBaldo Is there a meaningful difference, once it kills us all?

@FranklinBaldo Do you really think the public cares whether the AI that kills them was "military" or "civilian" AI?

predicts NO

@Tripping I don't really know much about the field of AI alignment, so I turned to Wikipedia. The opening paragraph of the article says, "In the field of artificial intelligence, AI alignment research aims to steer AI systems towards their designers’ intended goals and interests. An aligned AI system advances the intended objective; a misaligned AI system is competent at advancing some objective, but not the intended one."

It seems like you are saying that any failure of AI is necessarily misalignment. This paragraph seems to suggest there are at least two ways it can fail: misalignment and incompetence.

I get what you're saying about it being hard to tell the difference. But it seems like the way you're using it, "misalignment" is synonymous with "bad outcome," in which case why do we even need the term misalignment? It seems to me like the whole point of the concept of alignment has to do with the AI's intention (or what seems to a human like intention), not just the outcome.

@akrasiac Okay, let's say that misalignment and incompetence are separate. How well do you think you can distinguish them? Even in the cruise missile example, perhaps it was highly competently following its guidance system to the letter; it just so happens that the way we built the guidance system meant that doing so resulted in it falling on civilians. And it did so very precisely, exactly where it "intended" to fall; it just so happens that it wasn't where we were hoping it would fall.

Is that its incompetence, or merely ours in how we built it?

And that's the issue here. We build these things, and they will do whatever the laws of physics mechanically mandate that they do, given the specifics of how we build them. If they don't do what we wanted them to do, it is because we did not build them in a way that is aligned with our intentions - their intentions, to the extent that you can describe them as having any, are intentions that we built into them.

Again, if you have some kind of example, then by all means explain what you think would still count as misalignment under your new definition. You seem to think there is some kind of utility in your definition, but I'm still not even sure where you draw the boundaries.

predicts NO

@Tripping I am not primarily claiming that my definition has utility. I think your point that we can't really tell whether AI "intended" to do something is a good one. I'm just saying the definition I'm offering is more consistent with how most people seem to use the term "misalignment."

Having said that, though, I would think that the point of alignment research is to figure out how to distinguish incompetence from malignity. I agree with you that at some level the AI is just a machine doing what we made it do. But it's a complicated machine, and errors can come in at many different levels.

Let's say an AI weapon targeting system figures out that all bad guys wear red and good guys wear green. If it happened that its sensors were unexpectedly bad at measuring color in foggy conditions, that feels like a different kind of failure than one where the AI was actually wrong about bad guys wearing red and good guys green, and it therefore killed people because they wore the wrong T-shirt.

I would imagine people actually working in the field of alignment would have a better example.
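A minimal toy sketch of that distinction, in Python (everything here is hypothetical, purely to illustrate the two failure modes described in the comment above): the same "bad guys wear red" rule can fail because fog corrupts the sensor reading, or because the rule itself was never true.

```python
import random

def sense_color(true_color: str, foggy: bool) -> str:
    """Return the observed shirt color; fog makes the reading unreliable."""
    if foggy and random.random() < 0.4:
        return random.choice(["red", "green"])  # sensor misread (capability failure)
    return true_color

def is_hostile(observed_color: str) -> bool:
    """The built-in rule: 'bad guys wear red'."""
    return observed_color == "red"

# Failure mode 1: the rule is fine, but fog corrupts the observation.
print(is_hostile(sense_color("green", foggy=True)))   # may wrongly return True

# Failure mode 2: perfect sensing, but civilians also wear red,
# so the system confidently flags the wrong people.
print(is_hostile(sense_color("red", foggy=False)))    # True, exactly as built, still wrong
```

In the first case better sensors would help; in the second, the system is doing precisely what it was built to believe, which is closer to the misspecification people usually mean by "misalignment".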