Rationality: From AI to Zombies


by Eliezer Yudkowsky


  Eddin, a Green, looked up at the blue sky and began to laugh cynically. The course of his world’s history came clear at last; even he couldn’t believe they’d been such fools. “Stupid,” Eddin said, “stupid, stupid, and all the time it was right here.” Hatred, murders, wars, and all along it was just a thing somewhere, that someone had written about like they’d write about any other thing. No poetry, no beauty, nothing that any sane person would ever care about, just one pointless thing that had been blown out of all proportion. Eddin leaned against the cave mouth wearily, trying to think of a way to prevent this information from blowing up the world, and wondering if they didn’t all deserve it.

  Ferris gasped involuntarily, frozen by sheer wonder and delight. Ferris’s eyes darted hungrily about, fastening on each sight in turn before moving reluctantly to the next; the blue sky, the white clouds, the vast unknown outside, full of places and things (and people?) that no Undergrounder had ever seen. “Oh, so that’s what color it is,” Ferris said, and went exploring.

  *

  1. Procopius, History of the Wars, ed. Henry B. Dewing, vol. 1 (Harvard University Press, 1914).

  2. Edward Gibbon, The History of the Decline and Fall of the Roman Empire, vol. 4 (J. & J. Harper, 1829).

  13

  Belief in Belief

  Carl Sagan once told a parable of someone who comes to us and claims: “There is a dragon in my garage.” Fascinating! We reply that we wish to see this dragon—let us set out at once for the garage! “But wait,” the claimant says to us, “it is an invisible dragon.”

  Now as Sagan points out, this doesn’t make the hypothesis unfalsifiable. Perhaps we go to the claimant’s garage, and although we see no dragon, we hear heavy breathing from no visible source; footprints mysteriously appear on the ground; and instruments show that something in the garage is consuming oxygen and breathing out carbon dioxide.

  But now suppose that we say to the claimant, “Okay, we’ll visit the garage and see if we can hear heavy breathing,” and the claimant quickly says no, it’s an inaudible dragon. We propose to measure carbon dioxide in the air, and the claimant says the dragon does not breathe. We propose to toss a bag of flour into the air to see if it outlines an invisible dragon, and the claimant immediately says, “The dragon is permeable to flour.”

  Carl Sagan used this parable to illustrate the classic moral that poor hypotheses need to do fast footwork to avoid falsification. But I tell this parable to make a different point: The claimant must have an accurate model of the situation somewhere in their mind, because they can anticipate, in advance, exactly which experimental results they’ll need to excuse.

  Some philosophers have been much confused by such scenarios, asking, “Does the claimant really believe there’s a dragon present, or not?” As if the human brain only had enough disk space to represent one belief at a time! Real minds are more tangled than that. There are different types of belief; not all beliefs are direct anticipations. The claimant clearly does not anticipate seeing anything unusual upon opening the garage door. Otherwise they wouldn’t make advance excuses. It may also be that the claimant’s pool of propositional beliefs contains There is a dragon in my garage. It may seem, to a rationalist, that these two beliefs should collide and conflict even though they are of different types. Yet it is a physical fact that you can write “The sky is green!” next to a picture of a blue sky without the paper bursting into flames.

  The rationalist virtue of empiricism is supposed to prevent us from making this class of mistake. We’re supposed to constantly ask our beliefs which experiences they predict, make them pay rent in anticipation. But the dragon-claimant’s problem runs deeper, and cannot be cured with such simple advice. It’s not exactly difficult to connect belief in a dragon to anticipated experience of the garage. If you believe there’s a dragon in your garage, then you can expect to open up the door and see a dragon. If you don’t see a dragon, then that means there’s no dragon in your garage. This is pretty straightforward. You can even try it with your own garage.

  No, this invisibility business is a symptom of something much worse.

  Depending on how your childhood went, you may remember a time period when you first began to doubt Santa Claus’s existence, but you still believed that you were supposed to believe in Santa Claus, so you tried to deny the doubts. As Daniel Dennett observes, where it is difficult to believe a thing, it is often much easier to believe that you ought to believe it. What does it mean to believe that the Ultimate Cosmic Sky is both perfectly blue and perfectly green? The statement is confusing; it’s not even clear what it would mean to believe it—what exactly would be believed, if you believed. You can much more easily believe that it is proper, that it is good and virtuous and beneficial, to believe that the Ultimate Cosmic Sky is both perfectly blue and perfectly green. Dennett calls this “belief in belief.”1

  And here things become complicated, as human minds are wont to do—I think even Dennett oversimplifies how this psychology works in practice. For one thing, if you believe in belief, you cannot admit to yourself that you only believe in belief, because it is virtuous to believe, not to believe in belief, and so if you only believe in belief, instead of believing, you are not virtuous. Nobody will admit to themselves, “I don’t believe the Ultimate Cosmic Sky is blue and green, but I believe I ought to believe it”—not unless they are unusually capable of acknowledging their own lack of virtue. People don’t believe in belief in belief, they just believe in belief.

  (Those who find this confusing may find it helpful to study mathematical logic, which trains one to make very sharp distinctions between the proposition P, a proof of P, and a proof that P is provable. There are similarly sharp distinctions between P, wanting P, believing P, wanting to believe P, and believing that you believe P.)

  There are different kinds of belief in belief. You may believe in belief explicitly; you may recite in your deliberate stream of consciousness the verbal sentence “It is virtuous to believe that the Ultimate Cosmic Sky is perfectly blue and perfectly green.” (While also believing that you believe this, unless you are unusually capable of acknowledging your own lack of virtue.) But there are also less explicit forms of belief in belief. Maybe the dragon-claimant fears the public ridicule that they imagine will result if they publicly confess they were wrong (although, in fact, a rationalist would congratulate them, and others are more likely to ridicule the claimant if they go on claiming there’s a dragon in their garage). Maybe the dragon-claimant flinches away from the prospect of admitting to themselves that there is no dragon, because it conflicts with their self-image as the glorious discoverer of the dragon, who saw in their garage what all others had failed to see.

  If all our thoughts were deliberate verbal sentences like philosophers manipulate, the human mind would be a great deal easier for humans to understand. Fleeting mental images, unspoken flinches, desires acted upon without acknowledgement—these account for as much of ourselves as words.

  While I disagree with Dennett on some details and complications, I still think that Dennett’s notion of belief in belief is the key insight necessary to understand the dragon-claimant. But we need a wider concept of belief, not limited to verbal sentences. “Belief” should include unspoken anticipation-controllers. “Belief in belief” should include unspoken cognitive-behavior-guiders. It is not psychologically realistic to say, “The dragon-claimant does not believe there is a dragon in their garage; they believe it is beneficial to believe there is a dragon in their garage.” But it is realistic to say the dragon-claimant anticipates as if there is no dragon in their garage, and makes excuses as if they believed in the belief.

  You can possess an ordinary mental picture of your garage, with no dragons in it, which correctly predicts your experiences on opening the door, and never once think the verbal phrase There is no dragon in my garage. I even bet it’s happened to you—that when you open your garage door or bedroom door or whatever, and expect to see no dragons, no such verbal phrase runs through your mind.

  And to flinch away from giving up your belief in the dragon—or flinch away from giving up your self-image as a person who believes in the dragon—it is not necessary to explicitly think I want to believe there’s a dragon in my garage. It is only necessary to flinch away from the prospect of admitting you don’t believe.

  To correctly anticipate, in advance, which experimental results shall need to be excused, the dragon-claimant must (a) possess an accurate anticipation-controlling model somewhere in their mind, and (b) act cognitively to protect either (b1) their free-floating propositional belief in the dragon or (b2) their self-image of believing in the dragon.

  If someone believes in their belief in the dragon, and also believes in the dragon, the problem is much less severe. They will be willing to stick their neck out on experimental predictions, and perhaps even agree to give up the belief if the experimental prediction is wrong—although belief in belief can still interfere with this, if the belief itself is not absolutely confident. When someone makes up excuses in advance, it would seem to require that belief and belief in belief have become unsynchronized.

  *

  1. Daniel C. Dennett, Breaking the Spell: Religion as a Natural Phenomenon (Penguin, 2006).

  14

  Bayesian Judo

  You can have some fun with people whose anticipations get out of sync with what they believe they believe.

  I was once at a dinner party, trying to explain to a man what I did for a living, when he said: “I don’t believe Artificial Intelligence is possible because only God can make a soul.”

  At this point I must have been divinely inspired, because I instantly responded: “You mean if I can make an Artificial Intelligence, it proves your religion is false?”

  He said, “What?”

  I said, “Well, if your religion predicts that I can’t possibly make an Artificial Intelligence, then, if I make an Artificial Intelligence, it means your religion is false. Either your religion allows that it might be possible for me to build an AI; or, if I build an AI, that disproves your religion.”

  There was a pause, as he realized he had just made his hypothesis vulnerable to falsification, and then he said, “Well, I didn’t mean that you couldn’t make an intelligence, just that it couldn’t be emotional in the same way we are.”

  I said, “So if I make an Artificial Intelligence that, without being deliberately preprogrammed with any sort of script, starts talking about an emotional life that sounds like ours, that means your religion is wrong.”

  He said, “Well, um, I guess we may have to agree to disagree on this.”

  I said: “No, we can’t, actually. There’s a theorem of rationality called Aumann’s Agreement Theorem which shows that no two rationalists can agree to disagree. If two people disagree with each other, at least one of them must be doing something wrong.”

  We went back and forth on this briefly. Finally, he said, “Well, I guess I was really trying to say that I don’t think you can make something eternal.”

  I said, “Well, I don’t think so either! I’m glad we were able to reach agreement on this, as Aumann’s Agreement Theorem requires.” I stretched out my hand, and he shook it, and then he wandered away.

  A woman who had stood nearby, listening to the conversation, said to me gravely, “That was beautiful.”

  “Thank you very much,” I said.

  *

  15

  Pretending to be Wise

  The hottest place in Hell is reserved for those who in time of crisis remain neutral.

  —Dante Alighieri, famous hell expert

  John F. Kennedy, misquoter

  It’s common to put on a show of neutrality or suspended judgment in order to signal that one is mature, wise, impartial, or just has a superior vantage point.

  An example would be the case of my parents, who respond to theological questions like “Why does ancient Egypt, which had good records on many other matters, lack any records of Jews having ever been there?” with “Oh, when I was your age, I also used to ask that sort of question, but now I’ve grown out of it.”

  Another example would be the principal who, faced with two children who were caught fighting on the playground, sternly says: “It doesn’t matter who started the fight, it only matters who ends it.” Of course it matters who started the fight. The principal may not have access to good information about this critical fact, but if so, the principal should say so, not dismiss the importance of who threw the first punch. Let a parent try punching the principal, and we’ll see how far “It doesn’t matter who started it” gets in front of a judge. But to adults it is just inconvenient that children fight, and it matters not at all to their convenience which child started it. It is only convenient that the fight end as rapidly as possible.

  A similar dynamic, I believe, governs the occasions in international diplomacy where Great Powers sternly tell smaller groups to stop that fighting right now. It doesn’t matter to the Great Power who started it—who provoked, or who responded disproportionately to provocation—because the Great Power’s ongoing inconvenience is only a function of the ongoing conflict. Oh, can’t Israel and Hamas just get along?

  This I call “pretending to be Wise.” Of course there are many ways to try and signal wisdom. But trying to signal wisdom by refusing to make guesses—refusing to sum up evidence—refusing to pass judgment—refusing to take sides—staying above the fray and looking down with a lofty and condescending gaze—which is to say, signaling wisdom by saying and doing nothing—well, that I find particularly pretentious.

  Paulo Freire said, “Washing one’s hands of the conflict between the powerful and the powerless means to side with the powerful, not to be neutral.”1 A playground is a great place to be a bully, and a terrible place to be a victim, if the teachers don’t care who started it. And likewise in international politics: A world where the Great Powers refuse to take sides and only demand immediate truces is a great world for aggressors and a terrible place for the aggressed. But, of course, it is a very convenient world in which to be a Great Power or a school principal.

  So part of this behavior can be chalked up to sheer selfishness on the part of the Wise.

  But part of it also has to do with signaling a superior vantage point. After all—what would the other adults think of a principal who actually seemed to be taking sides in a fight between mere children? Why, it would lower the principal’s status to a mere participant in the fray!

  Similarly with the revered elder—who might be a CEO, a prestigious academic, or a founder of a mailing list—whose reputation for fairness depends on their refusal to pass judgment themselves, when others are choosing sides. Sides appeal to them for support, but almost always in vain; for the Wise are revered judges on the condition that they almost never actually judge—then they would just be another disputant in the fray, no better than any other mere arguer.

  (Oddly, judges in the actual legal system can repeatedly hand down real verdicts without automatically losing their reputation for impartiality. Maybe because of the understood norm that they have to judge, that it’s their job. Or maybe because judges don’t have to repeatedly rule on issues that have split a tribe on which they depend for their reverence.)

  There are cases where it is rational to suspend judgment, where people leap to judgment only because of their biases. As Michael Rooney said:

  The error here is similar to one I see all the time in beginning philosophy students: when confronted with reasons to be skeptics, they instead become relativists. That is, when the rational conclusion is to suspend judgment about an issue, all too many people instead conclude that any judgment is as plausible as any other.

  But then how can we avoid the (related but distinct) pseudo-rationalist behavior of signaling your unbiased impartiality by falsely claiming that the current balance of evidence is neutral? “Oh, well, of course you have a lot of passionate Darwinists out there, but I think the evidence we have doesn’t really enable us to make a definite endorsement of natural selection over intelligent design.”

  On this point I’d advise remembering that neutrality is a definite judgment. It is not staying above anything. It is putting forth the definite and particular position that the balance of evidence in a particular case licenses only one summation, which happens to be neutral. This, too, can be wrong; propounding neutrality is just as attackable as propounding any particular side.

  Likewise with policy questions. If someone says that both pro-life and pro-choice sides have good points and that they really should try to compromise and respect each other more, they are not taking a position above the two standard sides in the abortion debate. They are putting forth a definite judgment, every bit as particular as saying “pro-life!” or “pro-choice!”

  If your goal is to improve your general ability to form more accurate beliefs, it might be useful to avoid focusing on emotionally charged issues like abortion or the Israeli-Palestinian conflict. But it’s not that a rationalist is too mature to talk about politics. It’s not that a rationalist is above this foolish fray in which only mere political partisans and youthful enthusiasts would stoop to participate.

  As Robin Hanson describes it, the ability to have potentially divisive conversations is a limited resource. If you can think of ways to pull the rope sideways, you are justified in expending your limited resources on relatively less common issues where marginal discussion offers relatively higher marginal payoffs.

 
