If AI is used as a weapon system, it doesn't count, because there's no misalignment.
@Tripping Wouldn't that depend on WHY it killed the unintended people? Would it be misalignment if the AI was just bad at distinguishing civilian from military targets? You wouldn't call it misalignment if a guidance system on a cruise missile malfunctioned and it hit someone's house.
@Tripping I don't think that's a useful definition of misalignment. It seems like you're saying AI misalignment is synonymous with "AI making a mistake," which seems overly broad.
@akrasiac If an AI making a mistake kills 1000 people, 1000 people are dead. If we were trying to align it to our intentions, which did not involve it killing 1000 people, and it kills 1000 people anyway, then we did not succeed at aligning the AI to what we, the humans, wanted.
I think the way you're defining that situation allows you to say that an AI acting in ways contrary to how we wanted it to act isn't misalignment. I don't see what you would include as misalignment in that case.
@Tripping If a nuclear bomb explodes accidentally, that is not on par with an accident like Fukushima or Chernobyl.
@Tripping I mean that the use of nuclear energy for warfare is despicable, so if some accident happens it cannot be compared with the peaceful use of nuclear energy.
Like, what even is misalignment, if not that? Do you have another coherent definition, or even an example of what gets to count under your alternative definition? How would anyone be able to tell the difference between a situation where there was misalignment and one where there wasn't?
The normal definition is clear: it comes with a clear test for whether there's misalignment or not, namely, does the AI do what we want it to do, or did it do what we wanted it to do? In other words, is the AI aligned with us and what we want?
If you're not using that test, how are you coming up with ways to distinguish between misalignments that count and misalignments that don't?
@Tripping Yes, we care. It is much easier to convince the public that the military use of AI is bad than that the civilian use of AI is bad.
@FranklinBaldo Do you really think the public cares whether the AI that kills them was "military" or "civic" AI?
@Tripping I don't really know much about the field of AI alignment, so I turned to Wikipedia. The opening paragraph of the article says, "In the field of artificial intelligence, AI alignment research aims to steer AI systems towards their designers’ intended goals and interests. An aligned AI system advances the intended objective; a misaligned AI system is competent at advancing some objective, but not the intended one."
It seems like you are saying that any failure of AI is necessarily misalignment. This paragraph seems to suggest there are at least two ways it can fail: misalignment and incompetence.
I get what you're saying about it being hard to tell the difference. But it seems like the way you're using it, "misalignment" is synonymous with "bad outcome," in which case why do we even need the term misalignment? It seems to me like the whole point of the concept of alignment has to do with the AI's intention (or what seems to a human like intention), not just the outcome.
@akrasiac Okay, let's say that misalignment and incompetence are separate. How well do you think you can distinguish them? Even in the cruise missile example, perhaps the missile was competently following its guidance system to the letter; it just so happens that the way we built the guidance system meant that doing so resulted in it falling on civilians. And it fell very precisely, exactly where it "intended" to fall; it just wasn't where we were hoping it would fall.
Is that its incompetence, or merely ours in how we built it?
And that's the issue here. We build these things, and they will do whatever the laws of physics mechanically mandate given the specifics of how we build them. If they don't do what we wanted them to do, it is because we did not build them in a way that is aligned with our intentions; their intentions, to the extent that you can describe them as having any, are intentions that we built into them.
Again, if you have some kind of example, then by all means explain what you think would still count as misalignment under your new definition. You seem to think there is some kind of utility in your definition, but I'm still not even sure where you draw the boundaries.
@Tripping I am not primarily claiming that my definition has utility. I think your point that we can't really tell whether AI "intended" to do something is a good one. I'm just saying the definition I'm offering is more consistent with how most people seem to use the term "misalignment."
Having said that, though, I would think that the point of alignment research is to figure out how to distinguish incompetence from malign intent. I agree with you that at some level the AI is just a machine doing what we built it to do. But it's a complicated machine, and errors can come in at many different levels.
Let's say an AI weapon targeting system figures out that all bad guys wear red and good guys wear green. If its sensors happened to be unexpectedly bad at measuring color in foggy conditions, that feels like a different kind of failure than one where the AI was actually wrong about bad guys wearing red and good guys wearing green, and therefore killed people because they wore the wrong T-shirt.
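To make the contrast concrete, here's a toy Python sketch (everything in it is made up for illustration: the sensor model, the "red = hostile" rule, all the names). In the first failure the rule is fine but the sensor garbles its input; in the second the input is clean and the learned rule itself is wrong about the world.

```python
# Toy sketch of the two failure modes above. All of this is hypothetical.

def perceived_color(true_color: str, foggy: bool) -> str:
    """Sensor model: in fog the sensor swaps red and green.
    This is a perception (competence) failure."""
    if foggy:
        return {"red": "green", "green": "red"}[true_color]
    return true_color

def classify(color: str) -> str:
    """Learned rule: 'red shirts are hostile'. If this rule is wrong
    about the world, the system competently pursues the wrong
    objective, which is closer to what I'd call misalignment."""
    return "hostile" if color == "red" else "friendly"

# Failure mode 1: the rule is right, but fog garbles the input.
# A green-shirted good guy gets misread as red and flagged hostile.
assert classify(perceived_color("green", foggy=True)) == "hostile"

# Failure mode 2: the input is clean, but the rule itself is wrong;
# a red-shirted civilian is flagged hostile. The system did exactly
# what it was built to do, just not what we wanted.
assert classify(perceived_color("red", foggy=False)) == "hostile"
```

Both runs end with someone wrongly flagged, but only the second is the kind of failure where the system is doing exactly what it was built to do.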
I would imagine people actually working in alignment research would have a better example.