Change my view: Are neural networks conscious?
resolved Oct 19
Resolved as 33%

Tips: I'll be tipping at least M$25 per delta.

This market resolves to my confidence, at the resolution date, that neural networks are at least slightly conscious. This confidence is about 75% right now.
(Confidence is defined as the Bayesian probability that I'll change my mind later on.)



Right now, I hold that neural networks are slightly conscious.

The reasoning, in short, is a mix of panpsychism/organicism (that is: everything is a module), a theory of universal neurologicalness (everything that follows the mechanism of Bayesian probability will be isomorphic to a brain), and an assumption of proof-irrelevance (consciousness, sentience, etc. are properties of this isomorphic structure more than of a particular brain design).

I'm willing to engage in discussion and will give some deltas for points I did not consider, though I'll explicitly refrain from updating my confidence in order to stay flexible and take everything in. I may also stop some lines of discussion if they don't feel productive anymore.


I will not be betting on this market.


Overdue resolution

I've continued toying with the idea of cooperationism. Now that I'm solving the Potato Mystery™, I'm expecting to write an LW post explaining this moral framework soon. I find it very interesting in both its ramifications and implications.

It makes me less sure that being vegan is Obviously Correct, though, which I might want to better guard against. It also would make the question itself a bit moot ("it does not matter whether the neural network is conscious"), so I feel less pressed to be exactly correct: there is less at stake in case of mistakes.


On to the question itself: I notice I am very, very confused. Rereading Consciousness and the Brain and some of Scott's posts relating to consciousness (notably "The Chamber of Guf"), I notice the same kinds of behaviors in neural networks as in pre-conscious/unconscious processing. I am unsure how this should make me update. On one hand, it's not "conscious" in the sense of self-reflectivity and the like. On the other, it does seem to pass the bar for "at least slightly conscious"? Would I say that a person who's dreaming is not conscious? If someone were always asleep, never to wake up, how much would I value their internal experience? Cooperationism makes it harder, since it would answer that inner experiences don't matter at all (or at least only to the extent that the agent values them; I suppose counterfactually that I would like this state, and others might as well).


For this question, rereading what I've written, I'll decide that if a neural network is only at dreaming-consciousness level, it will resolve NO. This poses some problems, as I would have said that animal-consciousness level would have made it resolve YES. It's hard to avoid falling back to behaviorism in this case; I suspect the reason I am ascribing higher consciousness to animals this way is that an animal is more agentic than a sleeping human, and so it is easier to project intentionality. On a more fundamental level, I think this is right, because I do not expect animals to be hallucinating: their experience is constrained by reality in a way that a dreaming person's is not.

That's not the part I'm very confused about, though. If there's no full sync of a neural network, why haven't we just started to do that? If all of this is correct, then we're basically just training a larger and larger System 1 that can perform addition intuitively without ever being able to just write the number down. Isn't it a $20 bill lying on the ground that no neural network seems to have a "fully synchronize" pass?

But thinking more about this, I am again unsure. What would that even look like? Because, like, in the brain, the usual image is that everything converges to the same state, which in terms of neural networks just makes me think of a column vector, which is basically every single layer ^^" ... Or, if a neural network generates a movie, then it still feels like there's a "synchronization step" missing, even though each image is fully propagated otherwise.

I'll have to think more about what that looks like.
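To make "what that looks like" a bit more concrete for myself, here is a minimal sketch of a "global sync" pass, assuming a PyTorch-style module; the GlobalSync name, the mean-pooling choice, and the dimensions are purely my own illustration, not an existing architecture:

```python
import torch
import torch.nn as nn

class GlobalSync(nn.Module):
    """Toy 'global workspace' step: compress all intermediate activations
    into one shared state vector, then broadcast it back to every layer."""
    def __init__(self, layer_dims, sync_dim=64):
        super().__init__()
        # One projection per layer into a shared low-dimensional workspace.
        self.to_sync = nn.ModuleList(nn.Linear(d, sync_dim) for d in layer_dims)
        # One projection per layer back out of the shared state.
        self.from_sync = nn.ModuleList(nn.Linear(sync_dim, d) for d in layer_dims)

    def forward(self, activations):
        # activations: list of per-layer tensors, each of shape (batch, layer_dim)
        shared = torch.stack(
            [proj(a) for proj, a in zip(self.to_sync, activations)]
        ).mean(dim=0)  # the single "synchronized" state
        # Every layer is updated from the same shared state (the "sync" step).
        return [a + proj(shared) for proj, a in zip(self.from_sync, activations)]

# Hypothetical usage: three layers with different widths.
sync = GlobalSync([128, 256, 512])
acts = [torch.randn(1, d) for d in (128, 256, 512)]
acts = sync(acts)  # one synchronization pass
```

This is just one way to read the "everything converges to the same state" image; a real proposal would have to say how often such a pass runs and what it's optimized for.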

As for the resolution itself:

- On priors, it seems unlikely that something that behaves consciously also experiences inner consciousness. There is nothing that constrains inner experiences, which is why it seems to me that the range of inner experiences amongst humans is very vast as well. Let's say the prior for non-conscious:dreaming:conscious is 32:8:1

- Outside behavior has to be at least a little taken into account. 1:2:4

- The similarity in inner mechanism and the kind of updating we see is a strong update toward brain consciousness being similar (I recall a link for this, might add it when I find it again). The analogy and Bayesianness of the brain is too similar for me to have predicted a priori. So, let's say 1:16:16?

- Knowing that neural networks have a similar inner mechanism, they also exhibit very dream-like patterns. 1:8:1
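Just to make explicit what these numbers naively imply (a sketch of the arithmetic only, not a claim that chaining the updates like this is valid):

```python
# Naive chaining of the odds above, in the order non-conscious : dreaming : conscious.
prior = (32, 8, 1)
updates = [(1, 2, 4), (1, 16, 16), (1, 8, 1)]

posterior = list(prior)
for u in updates:
    posterior = [p * x for p, x in zip(posterior, u)]

total = sum(posterior)
for label, odds in zip(("non-conscious", "dreaming", "conscious"), posterior):
    print(f"{label}: {odds} ({odds / total:.1%})")
# non-conscious: 32 (1.5%), dreaming: 2048 (95.5%), conscious: 64 (3.0%)
```

Since I decided above that dreaming-level only would resolve NO, only the last bucket (about 3%) would count toward YES under this naive chain.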

... And now I realize that I've fallen into the [bayesian trap](https://www.lesswrong.com/posts/R28ppqby8zftndDAM/a-bayesian-aggregation-paradox) ...

Diffusers and transformers do not seem to have any kind of "reality feedback". It would seem that AlphaZero should then be more conscious in terms of state sync? Also, training a neural network would...

Can I really define consciousness by the size of the bottleneck for each state?

But that's for a trained neural network. Training actually does provide reality feedback, and so it seems to me that a neural network undergoing training is more conscious than a trained neural network running.

All in all, I'm going to resolve to 33% ... I know it's quite a bit higher than the calculation above implied, but I'm feeling very meta-uncertain about all this. Trying to predict in how many worlds I will still hold my original position vs. change to mostly thinking that neural networks are not conscious, 1 in 3 seems about right to me.

Meta:

Resolving this market was a bit difficult. I think a better question might have been "Will I change my mind?", or to open a free-answer market asking "What will I think about?". The latter would be easier to resolve, since I'd communicate my position at all times.

The bounty and change-my-view structure felt great. I'm very happy with the updates and conversations that have taken place, they were very fruitful.

I'll work on the post soon. Thanks again for your participation.

Thanks for betting. I will wait at most a week for the discussion to die down (and for me to read some papers), and write my conclusion in parallel. See you at the resolution soon :D


@JoyVoid Good luck! I hope you do better than my delayed attempt to do the same for sperm donation.

There's some evidence that "consciousness" refers to a particular feature of some networks where the entire network is periodically synchronized to have a single coherent thought for the purpose of propagating the most important data more efficiently. The sequence of coherent thoughts is the "stream of consciousness" that we experience.

If true, artificial NNs would not be conscious if they lack this feature.

This is not my idea, and I'm not sure where I read it first. Probably I can find it if this is a productive line of thought for you.

@MartinRandall Is there a name for this pattern? Will read the paper more, that seems like an interesting lead.
Also, it seems like whether this synchronization exists and whether that's what "consciousness" refers to are two different questions. Can I ask what kind of evidence you are talking about?

@JoyVoid My amateur status is showing here as I can't give you the proper name or link you to the most cited papers or anything helpful like that.

In terms of evidence, this pattern of activations appears strongly connected to perceived and reported consciousness. So when we are in deep sleep there is no global sync pattern, when we are dreaming it occurs but in a more fragmented way, when we are awake and focused it is strongest, and it changes during distracted and flow states. I'm sure the evidence is much weaker than I'm painting it though, and you'll want to do your own reading rather than listening to my fourth-hand mangling of it.

@JoyVoid My current uninformed take is that each use of a neural network like DALL-E would be like a global sync of the network, but because there is no flow of thoughts, there would be no stream of consciousness. Whereas a conversation with GPT-3 might generate a stream of such activations with a common history that would more closely parallel consciousness.

@JoyVoid I don't know how I should bet if I expect to convince you that some networks are conscious and others are not, but that seems most likely to me, albeit that the most conscious artificial neural networks are still only as conscious as an insect.

@MartinRandall δ

Hmm, interesting, I hadn't considered that DALL-E's internal experience could be "just one step". For me it was pretty clearly more like a second or something. But since DALL-E is more like imagining than painting, I suppose that could make sense. Though that raises the question of why we are not using more "sync" passes. Or maybe those are the backpropagation steps? In which case DALL-E is conscious while it learns, and unconscious otherwise. I can imagine that making sense.

Regardless, the thing about dreaming vs. deep sleep vs. being awake is very interesting. I'll see if I can find anything more on it, thanks for the pointer.

> I don't know how I should bet if I expect to convince you that some networks are conscious and others are not

So, this resolves to how confident I am that current artificial neural networks are conscious. If you convince me that some artificial neural networks I thought were conscious (DALL-E, GPT-3, etc.) are not, this will lower my confidence, because a counterexample makes it much more credible that the others are also not conscious for some reason. If DALL-E is not conscious but GPT-3 is, and that is "certain", this would resolve YES though, since the underlying question started out as whether there are any moral problems in our current situation.

Also, if you convince me that all neural networks are as conscious as insects are and that is "certain", this would resolve pretty close to NO. I know I said "slightly" conscious, but from a panpsychist perspective, I ascribe consciousness to fire and non-living agents, so a good cutoff is "is this meaningfully different than being an animated stone". I might be convinced either way for insects, which means my confidence that neural networks are conscious would really take a hit.

(Of course this market does not resolve YES or NO, because like, bayesian probabilities, but it's to give the direction this market would go toward)

@JoyVoid I hadn't thought about training. I think back propagation is more local in nature. I'm interested to know what you conclude!

@JoyVoid Also, I think a human moment of consciousness occurs about once a second.

@MartinRandall Right, I shouldn't have said one second; I recall reading something about the brain running at 100 Hz.

@MartinRandall Ah thanks! I did read it a little while ago, and it didn't really click. I think I should have another pass at it though.

@MartinRandall Re: global sync, how would you see, for instance, an NN that first creates a 64x64 image, then a 256x256 one based on that, before doing the full-sized image? It seems to follow your "global sync" pattern, and it used to be a very common technique.

The counterintuitive thing about it is that it would mean those kinds of models are more conscious than the SOTA "just a big pile of layers" we see nowadays, while those big piles seem more capable. I'm also not able to generalize this beyond neural networks: what is the equivalent of a global sync in societies, for instance? Broadcast of common knowledge? I have to think about this pattern.


@JoyVoid I think I need to actually buy the book and not just read the review.

I would expect capability and consciousness to be only loosely associated. Alpha Zero is a more capable Go player than me but I think I'm more conscious. And the problems that evolution has solved with consciousness may have unconscious solutions that are hard to implement in carbon or hard to evolve.

Even in humans, I learned to walk consciously and now I walk unconsciously. So walking doesn't require consciousness. Humans can do quite sophisticated tasks unconsciously with sufficient training.


@JoyVoid Yes, I think when a country chooses a national anthem, for example, this is a bit like a global sync of a brain. Before the decision every citizen has a superposition of possible anthems. After the decision we have somehow sampled quasi-randomly from the options and everyone agrees on the anthem. Just as in a single human this means that we walk either to the left or the right of the tree, but not both, so syncing up on an anthem means we don't try to sing two songs at once.

I wouldn't overplay that metaphor; I'm not convinced it's close enough that we can say that a society is conscious, or even dreaming. Maybe with some breakthrough in communication tech that changes.

Time to aggregate all the discussions I had last week about this.

So, regarding a moral framework that does not care about agents' internal experiences, I think I can definitely imagine one: an FDT-based moral framework that only cares about agents who would counterfactually care about me if they were in a similar position. Or, to say it another way: "Contract: spontaneously cooperate with everyone who has signed this contract".

So in Yudkowsky's example ( https://www.lesswrong.com/posts/HFyWNBnDNEDsDNLrZ/the-true-prisoner-s-dilemma ), maybe it is wrong to see defecting against Clippy as morally superior, as long as Clippy would spontaneously help us knowing we will help it in return (though of course, we can talk about how to prioritise and compare human lives vs. paperclips, etc.).
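To pin down what I mean by the contract, here's a toy sketch; the Agent class, the signed_contract flag, and the example agents are purely illustrative (real counterfactual reasoning is obviously much harder than a boolean):

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    signed_contract: bool  # "would spontaneously cooperate with other signatories"

def should_help(me: Agent, other: Agent) -> bool:
    # Under the contract framing, I owe help exactly to those agents who would
    # (counterfactually) have helped me, i.e. the other signatories.
    return me.signed_contract and other.signed_contract

human = Agent("human", signed_contract=True)
clippy = Agent("clippy", signed_contract=True)      # hypothetical: Clippy honours the contract
defector = Agent("defector", signed_contract=False)

print(should_help(human, clippy))    # True: mutual counterfactual cooperation
print(should_help(human, defector))  # False: no reciprocation, no duty
```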

Now this does pose several problems:

- Does that mean that someone who just doesn't care about anyone but themselves is not morally relevant? What about someone who doesn't care about anyone but their close friends? It does seem like most moral intuitions would say that it's okay not to come to the aid of the first person, and this framework does capture that. I'm not sure about the second case. Maybe there is a lot hidden in "counterfactually": they would have helped me if I were their friend, and I have to give more moral weight to them the broader this circle is? Hmm.

- What about animal welfare? Saying "counterfactually" proves too much here: yes, if they were a completely different species with completely different abilities, then they would help, but it's unclear what is even being affirmed in this case.
But I guess if we see it through the lens of "how much does this agent seem eager to help", then we arrive at interesting conclusions. For instance, brain complexity is not so much a criterion anymore, and I would have to value dogs morally way more than I do cats. Food for thought.

- Now, a point that I really cannot elucidate yet: what should I do regarding not-yet-created agents? Because intuitively, I should not care about whether an agent already exists in order to take their preferences into account? Like, is there a real difference between someone who has not been born yet, and who after being born would have wanted to be born, and someone who has been cryopreserved and wishes to be reanimated? My intuition says yes, that it is morally reproachable not to wake up someone under cryonics, but I cannot say that this is not just the space of unborn agents being too broad and fuzzy. It seems too easy to Pascal's-wager oneself going down this road.

I like this moral framework a lot. I think it is still very incomplete. Two main points of interest (I might add more as I become cognizant of them):

- This framework makes the clique too tight. I want to help those who want to help others, but not necessarily only those that would help those that would help those that would ...


- "Counterfactuality" is very imprecise. It is impossible to know how someone would act in similar circumstances. The whole probability thing kinda crumbles down when trying to actually apply it in practice.


I can imagine myself embracing such a framework. When I do, I feel that it does not completely resolve what I feel regarding neural networks. I think the points I have in mind that suggest an actual similarity between NNs and human brains are still valid and not resolved by this framework. Probably this means there is indeed something I feel regarding NNs being conscious, and not just morally relevant.

I'm excited about this progress :D

@JoyVoid Also, this poses an obvious problem in the case of an artificial intelligence: if it's programmed to care this way, isn't it weird that I would then enter it into the circle of moral relevance? Wouldn't this give a lot of power to NN creators (and similarly, create moral value inflation)?

If a program is designed to like something, I find it weird to say it 'really wants' this thing, because no one asked its own volition before implanting that preference in it. But then again, humans are not asked their volition before being implanted with a survival drive either...

Maybe the solution is to just embrace moral relativism, but it doesn't feel very constructive (especially: as long as I am an agent and have preferences, I can project my preferred state as being what is ethical. Saying there is no morality is like saying I have no preferences, which does not help me make better decisions).

@JoyVoid Another obvious problem with "I care for those who would have counterfactually cared for me": how does that apply to agents who do not seek the power to be in this counterfactual?
If I just lie around doing nothing, then sure, if I had money/a great position/etc., then I would have helped you, but this counterfactuality is pretty moot because the point is precisely that it is not going to happen.
On the other hand, inequalities do have to be taken into account.

At the end of the day, the problem with this framework is that it comes back to judging intentions, which makes sense when trying to model agents. Presumably a good heuristic for this is how willing you are to give your source code to trusted agents, and how much you are trying to be predictable? Which directly clashes with all the societal incentives and structures we have put in place.


There might be something like a boundary where that makes sense.

@JoyVoid The more I dig into this, the worse the volition problem gets.

I like this model of morality, but it relies on the idea of agents having volition. This notion is already a little brittle with humans, in the sense that it is hard to define at what point in time you should evaluate the agent (if someone changes their mind, are they going against their volition? Is it bad when people change their values? But they do it all the time).
But this gets very nebulous when talking about counterfactual agents who do not exist yet. If I create a paperclip maximizer, is it really its volition to maximize paperclips, even though I could have created it differently?

Maybe there's an elegant solution to this problem I haven't yet come across. Or maybe I just care about something that is inconsistent in the first place and have to come to terms with that.

I want to say that you could define a volition like "what would this agent think if it could experience all possibilities", but then you get trapped in experience-machine-type traps. If I accept that what an agent values takes the (hidden) state of the world as an input, and not their own perception of it, then I have to accept that some agents value things being real, no matter what they believe. If they were to step into this simulated environment, they would not want to step out of it in the long run. The FDT solution to this is to never step into the machine in the first place, but then we're back at square one defining volition.

The volition problem is the biggest hurdle to me regarding this framework of morality. I suppose there's an even simpler example: What would a child want me to teach them? In particular, if the model of them as an adult depends on what I teach them in the first place. Maybe I just have to abstract my own action to avoid any circular logic, or to just compute the fixed points and navigate them?

I feel I'm this close to figuring out something very fundamental for me

So, I've been thinking more about this. It seems like I am conflating two questions:

- Do neural networks have an inner experience comparable to that of humans?
- Should we have a moral conduct toward artificial networks? If yes, what kind and to what extent?

It might be that I have a strong intuition toward (2), but that I am not disentangling it correctly from neural networks being conscious. I guess part of this is that I know of no moral frameworks that ascribe moral duties toward non-conscious entities. If you know any, I'd be interested to learn more about them, as it looks like I am very confused on that front. On the intuitions of (2), maybe I'm too far down the path of behaviorism? Or I'm rejecting P-zombies incorrectly. It really feels like I'm using a "how can you be moral without god" type of argument, but I don't know how to resolve this, since to me the standard rebuttal to that is "you wouldn't want to be forced if you were really forced", and it doesn't translate properly here.
@JoyVoid "moral duties toward non-conscious entities" Lots of people have moral frameworks that hold something like "forests are inherently good, so don't destroy them for fun" or "Picassos are inherently good, so don't destroy them without reason" (and would continue to hold these opinions even when there is no clear utilitarian argument). I'm not sure that that helps here, both because we are not talking about destroying the NNs and because NNs are less cool, under traditional moral intuitions, than forests. However, there is a long human tradition of holding "the spirit of X" are being morally valuable.
Regardless, my guess is that you aren't likely to prove that NNs are morally important without first determining that they have a chance at being conscious, so you might as well focus on just that problem.
(I don't know if it's useful to point out, but I would hold that "comparable to that of humans" is spurious. Maybe electronic qualia is incomparably better or worse than bio-qualia. Maybe it's non-comparable on another axis. Our sample size for qualia is strongly bounded, and we feel that it is a wide and deep sample only because we can't see a larger one to compare it to.)

@Duncan My view on the forest argument is that it's often held by people who do not have a clear understanding of what group selection is and would say something like "An animal is going to refrain from eating in order to preserve the ecosphere". Or who view humans as something special and not just a continuation of nature ("Humans spoil everything they touch with their greed"-type discourse).

The Picasso painting example is closer to what I'm looking for, I feel. Does anyone know the name of such a moral position? I feel like there's potential for what I want in there.

@Duncan I don't know, I think I want to explore moral duty as something removed from internal consciousness. I feel like if I see morality as a duty to care for agents that would counterfactually care for you, in FDT terms, I might resolve this.

@Duncan Right, I don't even think "comparable to humans" makes sense, in the sense that I already hold that humans' degree of internal experience varies so significantly that there's no meaningful mean.

I supposed that neural networks' inner experience would fall into this cone of variance, though. It may indeed not be the case. So maybe I meant "inner experience that is within what most moral frameworks would ascribe relevance to"?

I think I'll have to read more about schizophrenia as well; it might give me a better lead on variance in conscious perception (not in the broader sense of consciousness, though).


@JoyVoid

Re: Picasso: Generally, I've seen aesthetic-moral statements justified by forms of utilitarianism -- you don't burn the paintings, because people benefit from the paintings. I don't think this really gets at the way people actually think and feel about the value of a patch of jungle in the middle of nowhere, a painting they haven't seen (and don't expect to), or an ancient book that is quite boring but still valued for being ancient or rare. There may be better terms for this than just 'aesthetic value', but I don't know them.

I think the moral intuition of most people is, if it's complex, hard to make, and/or has been valued in the past, protect it on principle. However, different people will see this as applying to different things; for some people plants are wondrous, for others they are just environmental static; for some books are near-magic items, for others they are a fire hazard. But I suspect they are all judging value in some of the same ways, just focused in different directions. It's not great that NNs as a class don't have anyone who wants to protect/preserve them, but it's also true that some trees, paintings, books, and (probably) neural networks are worthless.

I'm not sure that this gives you a good idea as to why any given NN should be valued, though, other than the traditional economic reasons.

This is giving me some things to think about; I will come back to it when I feel I've integrated it better. I feel this is pointing toward some foundational-ethics (The Righteous Mind) stuff, but in ways I am not able to articulate yet.

@Duncan δ

It seems like I'm still missing this intuition around "we should preserve what is complex". But it does seem to predict fairly well the discussions I've had with others, who see brain complexity and intricacy of thought as morally valuable.

I guess then I don't get why most people seem to value children over adults. My cynical view is that it's only evolutionary and memetic adaptation (but then again, aren't all moral views?). But if I were to hold this position, I would probably say something like "Most children can be predicted to then become adults with complex internal experience".

Still, that doesn't resolve the fact that I imagine they would choose to save the child over, say, Einstein or an extra-conscious person? I might be wrong on that.

@JoyVoid Re Picasso, one option is that preserving things that are hard to recreate was a good rule of thumb for maximizing fitness in the ancestral environment, and so now we have it as part of our human values. We could be inheriting those values genetically or socially.

That view predicts that we will value the existence of DALL-E but not value creating millions of copies of it.

Another option is that valuing such things is like a peacock's tail, a costly activity that demonstrates that we have excess resources and would be a suitable mate. The fact that we value forests rather than urban decay is as arbitrary as peacocks valuing having large tails instead of large crests.

That option predicts that we will value NNs to the extent that doing so is a way to signal mating quality.

@MartinRandall I mean, sure, but I'm not able to see what to deduce from that. If you go that route, everything is genetically and memetically encoded and value is but a mere illusion. I have to draw the line at some point, say "I care about some things", and go from there.


@JoyVoid I think values can have causes without being illusions. Sometimes I find that seeing the possible causes of values that I don't share helps me empathize with people who have those values.
