
Rationality: From AI to Zombies


by Eliezer Yudkowsky


  I don’t think this is actually true of Chalmers, though. If Chalmers lacked self-honesty, he could make things a lot easier on himself.

  (But just in case Chalmers is reading this and does have falsification-fear, I’ll point out that if epiphenomenalism is false, then there is some other explanation for that-which-we-call consciousness, and it will eventually be found, leaving Chalmers’s theory in ruins; so if Chalmers cares about his place in history, he has no motive to endorse epiphenomenalism unless he really thinks it’s true.)

  Chalmers is one of the most frustrating philosophers I know. Sometimes I wonder if he’s pulling an Atheism Conquered. Chalmers does this really sharp analysis . . . and then turns left at the last minute. He lays out everything that’s wrong with the Zombie World scenario, and then, having reduced the whole argument to smithereens, calmly accepts it.

  Chalmers does the same thing when he lays out, in calm detail, the problem with saying that our own beliefs in consciousness are justified, when our zombie twins say exactly the same thing for exactly the same reasons and are wrong.

  On Chalmers’s theory, Chalmers’s saying that he believes in consciousness cannot be causally justified; the belief is not caused by the fact itself. In the absence of consciousness, Chalmers would write the same papers for the same reasons.

  On epiphenomenalism, Chalmers’s saying that he believes in consciousness cannot be justified as the product of a process that systematically outputs true beliefs, because the zombie twin writes the same papers using the same systematic process and is wrong.

  Chalmers admits this. Chalmers, in fact, explains the argument in great detail in his book. Okay, so Chalmers has solidly proven that he is not justified in believing in epiphenomenal consciousness, right? No. Chalmers writes:

  Conscious experience lies at the center of our epistemic universe; we have access to it directly. This raises the question: what is it that justifies our beliefs about our experiences, if it is not a causal link to those experiences, and if it is not the mechanisms by which the beliefs are formed? I think the answer to this is clear: it is having the experiences that justifies the beliefs. For example, the very fact that I have a red experience now provides justification for my belief that I am having a red experience . . .

  Because my zombie twin lacks experiences, he is in a very different epistemic situation from me, and his judgments lack the corresponding justification. It may be tempting to object that if my belief lies in the physical realm, its justification must lie in the physical realm; but this is a non sequitur. From the fact that there is no justification in the physical realm, one might conclude that the physical portion of me (my brain, say) is not justified in its belief. But the question is whether I am justified in the belief, not whether my brain is justified in the belief, and if property dualism is correct then there is more to me than my brain.

  So—if I’ve got this thesis right—there’s a core you, above and beyond your brain, that believes it is not a zombie, and directly experiences not being a zombie; and so its beliefs are justified.

  But Chalmers just wrote all that stuff down, in his very physical book, and so did the zombie-Chalmers.

  The zombie Chalmers can’t have written the book because of the zombie’s core self above the brain; there must be some entirely different reason, within the laws of physics.

  It follows that even if there is a part of Chalmers hidden away that is conscious and believes in consciousness, directly and without mediation, there is also a separable subspace of Chalmers—a causally closed cognitive subsystem that acts entirely within physics—and this “outer self” is what speaks Chalmers’s internal narrative, and writes papers on consciousness.

  I do not see any way to evade the charge that, on Chalmers’s own theory, this separable outer Chalmers is deranged. This is the part of Chalmers that is the same in this world, or the Zombie World; and in either world it writes philosophy papers on consciousness for no valid reason. Chalmers’s philosophy papers are not output by that inner core of awareness and belief-in-awareness; they are output by the mere physics of the internal narrative that makes Chalmers’s fingers strike the keys of his computer.

  And yet this deranged outer Chalmers is writing philosophy papers that just happen to be perfectly right, by a separate and additional miracle. Not a logically necessary miracle (then the Zombie World would not be logically possible). A physically contingent miracle, that happens to be true in what we think is our universe, even though science can never distinguish our universe from the Zombie World.

  Or at least, that would seem to be the implication of what the self-confessedly deranged outer Chalmers is telling us.

  I think I speak for all reductionists when I say Huh?

  That’s not epicycles. That’s, “Planetary motions follow these epicycles—but epicycles don’t actually do anything—there’s something else that makes the planets move the same way the epicycles say they should, which I haven’t been able to explain—and by the way, I would say this even if there weren’t any epicycles.”

  I have a nonstandard perspective on philosophy because I look at everything with an eye to designing an AI; specifically, a self-improving Artificial General Intelligence with stable motivational structure.

  When I think about designing an AI, I ponder principles like probability theory, the Bayesian notion of evidence as differential diagnostic, and above all, reflective coherence. Any self-modifying AI that starts out in a reflectively inconsistent state won’t stay that way for long.
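  (In symbols, the "evidence as differential diagnostic" idea is just the odds-ratio form of Bayes's theorem: an observation E shifts belief in a hypothesis H only to the extent that H and not-H assign E different likelihoods. H and E here are generic placeholders, not anything specific to this discussion.)

    \[
      \frac{P(H \mid E)}{P(\lnot H \mid E)}
      \;=\;
      \frac{P(E \mid H)}{P(E \mid \lnot H)}
      \times
      \frac{P(H)}{P(\lnot H)}
    \]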

  If a self-modifying AI looks at a part of itself that concludes “B” on condition A—a part of itself that writes “B” to memory whenever condition A is true—and the AI inspects this part, determines how it (causally) operates in the context of the larger universe, and the AI decides that this part systematically tends to write false data to memory, then the AI has found what appears to be a bug, and the AI will self-modify not to write “B” to the belief pool under condition A.
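  To make that check concrete, here is a minimal toy sketch in Python. It is my own illustration, not anything from the book, and every name in it (Rule, audit_rule, ReflectiveAgent) is hypothetical. The agent treats one of its own belief-writing rules as an object of study, measures how often the rule writes true data when its trigger condition fires, and deletes the rule if it is systematically unreliable.

    from dataclasses import dataclass
    from typing import Any, Callable, List, Optional, Tuple

    @dataclass
    class Rule:
        condition: Callable[[Any], bool]   # "condition A": when the rule fires
        conclusion: Callable[[Any], Any]   # "B": what it writes to the belief pool

    def audit_rule(rule: Rule, cases: List[Tuple[Any, Any]]) -> Optional[float]:
        """Observed accuracy of the rule on cases where its condition fires.

        Each case is an (observation, ground_truth) pair. Returns None if the
        condition never fired, so there is nothing to judge."""
        fired = [(obs, truth) for obs, truth in cases if rule.condition(obs)]
        if not fired:
            return None
        correct = sum(1 for obs, truth in fired if rule.conclusion(obs) == truth)
        return correct / len(fired)

    class ReflectiveAgent:
        def __init__(self, rules: List[Rule]):
            self.rules = list(rules)

        def self_modify(self, cases: List[Tuple[Any, Any]],
                        threshold: float = 0.5) -> None:
            """Drop any rule that systematically writes false data to memory."""
            kept = []
            for rule in self.rules:
                accuracy = audit_rule(rule, cases)
                if accuracy is None or accuracy >= threshold:
                    kept.append(rule)  # no evidence of a bug; keep the rule
                # else: the rule looks like a bug; stop writing "B" under "A"
            self.rules = kept

  The only point of the sketch is the shape of the move: the rule is judged by how it causally operates in the larger world, not by whether its outputs feel right from the inside.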

  Any epistemological theory that disregards reflective coherence is not a good theory to use in constructing self-improving AI. This is a knockdown argument from my perspective, considering what I intend to actually use philosophy for. So I have to invent a reflectively coherent theory anyway. And when I do, by golly, reflective coherence turns out to make intuitive sense.

  So that’s the unusual way in which I tend to think about these things. And now I look back at Chalmers:

  The causally closed “outer Chalmers” (that is not influenced in any way by the “inner Chalmers” that has separate additional awareness and beliefs) must be carrying out some systematically unreliable, unwarranted operation which in some unexplained fashion causes the internal narrative to produce beliefs about an “inner Chalmers” that are correct for no logical reason in what happens to be our universe.

  But there’s no possible warrant for the outer Chalmers or any reflectively coherent self-inspecting AI to believe in this mysterious correctness. A good AI design should, I think, look like a reflectively coherent intelligence embodied in a causal system, with a testable theory of how that selfsame causal system produces systematically accurate beliefs on the way to achieving its goals.

  So the AI will scan Chalmers and see a closed causal cognitive system producing an internal narrative that is uttering nonsense. Nonsense that seems to have a high impact on what Chalmers thinks should be considered a morally valuable person.

  This is not a necessary problem for Friendly AI theorists. It is only a problem if you happen to be an epiphenomenalist. If you believe either the reductionists (consciousness happens within the atoms) or the substance dualists (consciousness is causally potent immaterial stuff), people talking about consciousness are talking about something real, and a reflectively consistent Bayesian AI can see this by tracing back the chain of causality for what makes people say “consciousness.”

  According to Chalmers, the causally closed cognitive system of Chalmers’s internal narrative is (mysteriously) malfunctioning in a way that, not by necessity, but just in our universe, miraculously happens to be correct. Furthermore, the internal narrative asserts “the internal narrative is mysteriously malfunctioning, but miraculously happens to be correctly echoing the justified thoughts of the epiphenomenal inner core,” and again, in our universe, miraculously happens to be correct.

  Oh, come on!

  Shouldn’t there come a point where you just give up on an idea? Where, on some raw intuitive level, you just go: What on Earth was I thinking?

  Humanity has accumulated some broad experience with what correct theories of the world look like. This is not what a correct theory looks like.

  “Argument from incredulity,” you say. Fine, you want it spelled out? The said Chalmersian theory postulates multiple unexplained complex miracles. This drives down its prior probability, by the conjunction rule of probability and Occam’s Razor. It is therefore dominated by at least two theories that postulate fewer miracles, namely:

  Substance dualism: There is a stuff of consciousness which is not yet understood, an extraordinary super-physical stuff that visibly affects our world; and this stuff is what makes us talk about consciousness.

  Not-quite-faith-based reductionism: That-which-we-name “consciousness” happens within physics, in a way not yet understood, just like what happened the last three thousand times humanity ran into something mysterious.

  Your intuition that no material substance can possibly add up to consciousness is incorrect. If you actually knew exactly why you talk about consciousness, this would give you new insights, of a form you can’t now anticipate; and afterward you would realize that your arguments about normal physics having no room for consciousness were flawed.

  Compare to:

  Epiphenomenal property dualism: Matter has additional consciousness-properties which are not yet understood. These properties are epiphenomenal with respect to ordinarily observable physics—they make no difference to the motion of particles.

  Separately, there exists a not-yet-understood reason within normal physics why philosophers talk about consciousness and invent theories of dual properties.

  Miraculously, when philosophers talk about consciousness, the bridging laws of our world are exactly right to make this talk about consciousness correct, even though it arises from a malfunction (drawing of logically unwarranted conclusions) in the causally closed cognitive system that types philosophy papers.
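  (Spelling out the Occam penalty in symbols, with M1, M2, M3 standing for the three postulates just listed: by the chain rule, each additional conjunct can only shrink the joint prior, so a theory that needs all three starts out no more probable than its single least probable miracle.)

    \[
      P(M_1 \land M_2 \land M_3)
      \;=\; P(M_1)\, P(M_2 \mid M_1)\, P(M_3 \mid M_1 \land M_2)
      \;\le\; \min\{P(M_1),\, P(M_2),\, P(M_3)\}
    \]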

  I know I’m speaking from limited experience, here. But based on my limited experience, the Zombie Argument may be a candidate for the most deranged idea in all of philosophy.

  There are times when, as a rationalist, you have to believe things that seem weird to you. Relativity seems weird, quantum mechanics seems weird, natural selection seems weird.

  But these weirdnesses are pinned down by massive evidence. There’s a difference between believing something weird because science has confirmed it overwhelmingly—

  —versus believing a proposition that seems downright deranged, because of a great big complicated philosophical argument centered around unspecified miracles and giant blank spots not even claimed to be understood—

  —in a case where even if you accept everything that has been told to you so far, afterward the phenomenon will still seem like a mystery and still have the same quality of wondrous impenetrability that it had at the start.

  The correct thing for a rationalist to say at this point, if all of David Chalmers’s arguments seem individually plausible—which they don’t, to me—is:

  “Okay . . . I don’t know how consciousness works . . . I admit that . . . and maybe I’m approaching the whole problem wrong, or asking the wrong questions . . . but this zombie business can’t possibly be right. The arguments aren’t nailed down enough to make me believe this—especially when accepting it won’t make me feel any less confused. On a core gut level, this just doesn’t look like the way reality could really really work.”

  Mind you, I am not saying this is a substitute for careful analytic refutation of Chalmers’s thesis. System 1 is not a substitute for System 2, though it can help point the way. You still have to track down where the problems are specifically.

  Chalmers wrote a big book, not all of which is available through free Google preview. I haven’t duplicated the long chains of argument where Chalmers lays out the arguments against himself in calm detail. I’ve just tried to tack on a final refutation of Chalmers’s last presented defense, which Chalmers has not yet countered to my knowledge. Hit the ball back into his court, as it were.

  But, yes, on a core level, the sane thing to do when you see the conclusion of the zombie argument, is to say “That can’t possibly be right” and start looking for a flaw.

  *

  1. Chalmers, The Conscious Mind.


  Zombie Responses

  I’m a bit tired today, having stayed up until 3 a.m. writing yesterday’s >6,000-word essay on zombies, so today I’ll just reply to Richard, and tie up a loose end I spotted the next day.

  (A) Richard Chappell writes:

  A terminological note (to avoid unnecessary confusion): what you call “conceivable,” others of us would merely call “apparently conceivable.”

  The gap between “I don’t see a contradiction yet” and “this is logically possible” is so huge (it’s NP-complete even in some simple-seeming cases) that you really should have two different words. As the zombie argument is boosted to the extent that this huge gap can be swept under the rug of minor terminological differences, I really think it would be a good idea to say “conceivable” versus “logically possible” or maybe even have a still more visible distinction. I can’t choose professional terminology that has already been established, but in a case like this, I might seriously refuse to use it.

  Maybe I will say “apparently conceivable” for the kind of information that zombie advocates get by imagining Zombie Worlds, and “logically possible” for the kind of information that is established by exhibiting a complete model or logical proof. Note the size of the gap between the information you can get by closing your eyes and imagining zombies, and the information you need to carry the argument for epiphenomenalism.
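  A concrete illustration of the gap (mine, not anything in the correspondence): “I don’t see a contradiction yet” reports a failed search, while “logically possible” claims that a satisfying model exists, and finding such a model is, in general, the NP-complete satisfiability problem. Here is a tiny brute-force sketch of what “exhibiting a complete model” means for a propositional formula, with all names hypothetical:

    from itertools import product

    def find_model(variables, clauses):
        """Brute-force search for an assignment satisfying a CNF formula.

        Each clause is a list of (variable, wanted_value) literals; a clause is
        satisfied when at least one literal matches the assignment. Returns a
        complete model (dict) or None. The search is exponential in the number
        of variables, which is the point: establishing logical possibility
        means producing such a model, not merely failing to spot a clash."""
        for values in product([False, True], repeat=len(variables)):
            assignment = dict(zip(variables, values))
            if all(any(assignment[v] == wanted for v, wanted in clause)
                   for clause in clauses):
                return assignment
        return None

    # (p or not q) and (q or r): the first model found is p=False, q=False, r=True.
    print(find_model(["p", "q", "r"],
                     [[("p", True), ("q", False)], [("q", True), ("r", True)]]))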

  That is, your view would be characterized as a form of Type-A materialism, the view that zombies are not even (genuinely) conceivable, let alone metaphysically possible.

  Type-A materialism is a large bundle; you shouldn’t attribute the bundle to me until you see me agree with each of the parts. I think that someone who asks “What is consciousness?” is asking a legitimate question, has a legitimate demand for insight; I don’t necessarily think that the answer takes the form of “Here is this stuff that has all the properties you would attribute to consciousness, for such-and-such reason,” but may to some extent consist of insights that cause you to realize you were asking the question the wrong way.

  This is not being eliminative about consciousness. It is being realistic about what kind of insights to expect, faced with a problem that (1) seems like it must have some solution, (2) seems like it cannot possibly have any solution, and (3) is being discussed in a fashion that has a great big dependence on the not-fully-understood ad-hoc architecture of human cognition.

  (1) You haven’t, so far as I can tell, identified any logical contradiction in the description of the zombie world. You’ve just pointed out that it’s kind of strange. But there are many bizarre possible worlds out there. That’s no reason to posit an implicit contradiction. So it’s still completely mysterious to me what this alleged contradiction is supposed to be.

  Okay, I’ll spell it out from a materialist standpoint:

  1. The zombie world, by definition, contains all parts of our world that are within the closure of the “caused by” or “effect of” relation of any observable phenomenon. In particular, it contains the cause of my visibly saying, “I think therefore I am.”

  2. When I focus my inward awareness on my inward awareness, I shortly thereafter experience my internal narrative saying “I am focusing my inward awareness on my inward awareness,” and can, if I choose, say so out loud.

  3. Intuitively, it sure seems like my inward awareness is causing my internal narrative to say certain things, and that my internal narrative can cause my lips to say certain things.

  4. The word “consciousness,” if it has any meaning at all, refers to that-which-is or that-which-causes or that-which-makes-me-say-I-have inward awareness.

  5. From (3) and (4) it would follow that if the zombie world is closed with respect to the causes of my saying “I think therefore I am,” the zombie world contains that which we refer to as “consciousness.”

  6. By definition, the zombie world does not contain consciousness.

  7. (3) seems to me to have a rather high probability of being empirically true. Therefore I evaluate a high empirical probability that the zombie world is logically impossible.

  You can save the Zombie World by letting the cause of my internal narrative’s saying “I think therefore I am” be something entirely other than consciousness. In conjunction with the assumption that consciousness does exist, this is the part that struck me as deranged.

  But if the above is conceivable, then isn’t the Zombie World conceivable?

  No, because the two constructions of the Zombie World involve giving the word “consciousness” different empirical referents, like “water” in our world meaning H2O versus “water” in Putnam’s Twin Earth meaning XYZ. For the Zombie World to be logically possible, it does not suffice that, for all you knew about how the empirical world worked, the word “consciousness” could have referred to an epiphenomenon that is entirely different from the consciousness we know. The Zombie World lacks consciousness, not “consciousness”—it is a world without H2O, not a world without “water.” This is what is required to carry the empirical statement, “You could eliminate the referent of whatever is meant by ‘consciousness’ from our world, while keeping all the atoms in the same place.”

 
