Rationality: From AI to Zombies

by Eliezer Yudkowsky

I, too, used to say things like that, before I understood that Nature was allowed to kill me.

In that moment of realization, my childhood technophilia finally broke.

I finally understood that even if you diligently followed the rules of science and were a nice person, Nature could still kill you. I finally understood that even if you were the best project out of all available candidates, Nature could still kill you.

I understood that I was not being graded on a curve. My gaze shook free of rivals, and I saw the sheer blank wall.

I looked back and I saw the careful arguments I had constructed, for why the wisest choice was to continue forward at full speed, just as I had planned to do before. And I understood then that even if you constructed an argument showing that something was the best course of action, Nature was still allowed to say “So what?” and kill you.

I looked back and saw that I had claimed to take into account the risk of a fundamental mistake, that I had argued reasons to tolerate the risk of proceeding in the absence of full knowledge.

And I saw that the risk I wanted to tolerate would have killed me. And I saw that this possibility had never been really real to me. And I saw that even if you had wise and excellent arguments for taking a risk, the risk was still allowed to go ahead and kill you. Actually kill you.

For it is only the action that matters, and not the reasons for doing anything. If you build the gun and load the gun and put the gun to your head and pull the trigger, even with the cleverest of arguments for carrying out every step—then, bang.

I saw that only my own ignorance of the rules had enabled me to argue for going ahead without complete knowledge of the rules; for if you do not know the rules, you cannot model the penalty of ignorance.

I saw that others, still ignorant of the rules, were saying, “I will go ahead and do X”; and that to the extent that X was a coherent proposal at all, I knew that would result in a bang; but they said, “I do not know it cannot work.” I would try to explain to them the smallness of the target in the search space, and they would say “How can you be so sure I won’t win the lottery?,” wielding their own ignorance as a bludgeon.

And so I realized that the only thing I could have done to save myself, in my previous state of ignorance, was to say: “I will not proceed until I know positively that the ground is safe.” And there are many clever arguments for why you should step on a piece of ground that you don’t know to contain a landmine; but they all sound much less clever, after you look to the place that you proposed and intended to step, and see the bang.

I understood that you could do everything that you were supposed to do, and Nature was still allowed to kill you. That was when my last trust broke. And that was when my training as a rationalist began.

*

302

Beyond the Reach of God

This essay is a tad gloomier than usual, as I measure such things. It deals with a thought experiment I invented to smash my own optimism, after I realized that optimism had misled me. Those readers sympathetic to arguments like, “It’s important to keep our biases because they help us stay happy,” should consider not reading. (Unless they have something to protect, including their own life.)

So! Looking back on the magnitude of my own folly, I realized that at the root of it had been a disbelief in the Future’s vulnerability—a reluctance to accept that things could really turn out wrong. Not as the result of any explicit propositional verbal belief. More like something inside that persisted in believing, even in the face of adversity, that everything would be all right in the end.

Some would account this a virtue (zettai daijobu da yo), and others would say that it’s a thing necessary for mental health.

But we don’t live in that world. We live in the world beyond the reach of God.

It’s been a long, long time since I believed in God. Growing up in an Orthodox Jewish family, I can recall the last remembered time I asked God for something, though I don’t remember how old I was. I was putting in some request on behalf of the next-door-neighboring boy, I forget what exactly—something along the lines of, “I hope things turn out all right for him,” or maybe, “I hope he becomes Jewish.”

I remember what it was like to have some higher authority to appeal to, to take care of things I couldn’t handle myself. I didn’t think of it as “warm,” because I had no alternative to compare it to. I just took it for granted.

Still I recall, though only from distant childhood, what it’s like to live in the conceptually impossible possible world where God exists. Really exists, in the way that children and rationalists take all their beliefs at face value.

In the world where God exists, does God intervene to optimize everything? Regardless of what rabbis assert about the fundamental nature of reality, the take-it-seriously operational answer to this question is obviously “No.” You can’t ask God to bring you a lemonade from the refrigerator instead of getting one yourself. When I believed in God after the serious fashion of a child, so very long ago, I didn’t believe that.

Postulating that particular divine inaction doesn’t provoke a full-blown theological crisis. If you said to me, “I have constructed a benevolent superintelligent nanotech-user,” and I said “Give me a banana,” and no banana appeared, this would not yet disprove your statement. Human parents don’t always do everything their children ask. There are some decent fun-theoretic arguments—I even believe them myself—against the idea that the best kind of help you can offer someone is to always immediately give them everything they want. I don’t think that eudaimonia is formulating goals and having them instantly fulfilled; I don’t want to become a simple wanting-thing that never has to plan or act or think.

So it’s not necessarily an attempt to avoid falsification to say that God does not grant all prayers. Even a Friendly AI might not respond to every request.

But clearly there exists some threshold of horror awful enough that God will intervene. I remember that being true, when I believed after the fashion of a child.

The God who does not intervene at all, no matter how bad things get—that’s an obvious attempt to avoid falsification, to protect a belief-in-belief. Sufficiently young children don’t have the deep-down knowledge that God doesn’t really exist. They really expect to see a dragon in their garage. They have no reason to imagine a loving God who never acts. Where exactly is the boundary of sufficient awfulness? Even a child can imagine arguing over the precise threshold. But of course God will draw the line somewhere. Few indeed are the loving parents who, desiring their child to grow up strong and self-reliant, would let their toddler be run over by a car.

The obvious example of a horror so great that God cannot tolerate it is death—true death, mind-annihilation. I don’t think that even Buddhism allows that. So long as there is a God in the classic sense—full-blown, ontologically fundamental, the God—we can rest assured that no sufficiently awful event will ever, ever happen. There is no soul anywhere that need fear true annihilation; God will prevent it.

What if you build your own simulated universe? The classic example of a simulated universe is Conway’s Game of Life. I do urge you to investigate Life if you’ve never played it—it’s important for comprehending the notion of “physical law.” Conway’s Life has been proven Turing-complete, so it would be possible to build a sentient being in the Life universe, although it might be rather fragile and awkward. Other cellular automata would make it simpler.

Could you, by creating a simulated universe, escape the reach of God? Could you simulate a Game of Life containing sentient entities, and torture the beings therein? But if God is watching everywhere, then trying to build an unfair Life just results in the God stepping in to modify your computer’s transistors. If the physics you set up in your computer program calls for a sentient Life-entity to be endlessly tortured for no particular reason, the God will intervene. God being omnipresent, there is no refuge anywhere for true horror. Life is fair.

But suppose that instead you ask the question:

Given such-and-such initial conditions, and given such-and-such cellular automaton rules, what would be the mathematical result?

Not even God can modify the answer to this question, unless you believe that God can implement logical impossibilities. Even as a very young child, I don’t remember believing that. (And why would you need to believe it, if God can modify anything that actually exists?)

What does Life look like, in this imaginary world where every step follows only from its immediate predecessor? Where things only ever happen, or don’t happen, because of the cellular automaton rules? Where the initial conditions and rules don’t describe any God that checks over each state? What does it look like, the world beyond the reach of God?

That world wouldn’t be fair. If the initial state contained the seeds of something that could self-replicate, natural selection might or might not take place, and complex life might or might not evolve, and that life might or might not become sentient, with no God to guide the evolution. That world might evolve the equivalent of conscious cows, or conscious dolphins, that lacked hands to improve their condition; maybe they would be eaten by conscious wolves who never thought that they were doing wrong, or cared.

If in a vast plethora of worlds, something like humans evolved, then they would suffer from diseases—not to teach them any lessons, but only because viruses happened to evolve as well, under the cellular automaton rules.

If the people of that world are happy, or unhappy, the causes of their happiness or unhappiness may have nothing to do with good or bad choices they made. Nothing to do with free will or lessons learned. In the what-if world where every step follows only from the cellular automaton rules, the equivalent of Genghis Khan can murder a million people, and laugh, and be rich, and never be punished, and live his life much happier than the average. Who prevents it? God would prevent it from ever actually happening, of course; He would at the very least visit some shade of gloom in the Khan’s heart. But in the mathematical answer to the question What if? there is no God in the axioms. So if the cellular automaton rules say that the Khan is happy, that, simply, is the whole and only answer to the what-if question. There is nothing, absolutely nothing, to prevent it.

And if the Khan tortures people horribly to death over the course of days, for his own amusement perhaps? They will call out for help, perhaps imagining a God. And if you really wrote that cellular automaton, God would intervene in your program, of course. But in the what-if question, what the cellular automaton would do under the mathematical rules, there isn’t any God in the system. Since the physical laws contain no specification of a utility function—in particular, no prohibition against torture—then the victims will be saved only if the right cells happen to be 0 or 1. And it’s not likely that anyone will defy the Khan; if they did, someone would strike them with a sword, and the sword would disrupt their organs and they would die, and that would be the end of that. So the victims die, screaming, and no one helps them; that is the answer to the what-if question.

Could the victims be completely innocent? Why not, in the what-if world? If you look at the rules for Conway’s Game of Life (which is Turing-complete, so we can embed arbitrary computable physics in there), then the rules are really very simple. Cells with three living neighbors stay alive; cells with two neighbors stay the same; all other cells die. There isn’t anything in there about innocent people not being horribly tortured for indefinite periods.
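To make the simplicity of those rules concrete, here is a minimal sketch of one Life generation in Python. The names `step` and `live` are illustrative choices of mine, not part of any standard library, and the sketch assumes an unbounded grid stored as a set of live-cell coordinates. It uses the standard form of the rule: a cell with exactly three live neighbors is alive in the next generation (whether or not it was alive before), a cell with exactly two keeps its current state, and every other cell is dead.

```python
from collections import Counter


def step(live):
    """Return the next generation, given a set of (x, y) live cells.

    Illustrative sketch only: the grid is assumed unbounded, and the
    function is a pure mapping from one state to the next.
    """
    # For every cell adjacent to at least one live cell, count how many
    # live neighbors it has.
    neighbor_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Three neighbors: alive. Two neighbors: unchanged, so alive only if
    # already alive. Anything else: dead.
    return {
        cell
        for cell, n in neighbor_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# A horizontal "blinker" turns vertical after one step, and back after two.
blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(step(blinker)))
print(step(step(blinker)) == blinker)
```

The whole physics fits in one comparison on a neighbor count. Nothing in `step` inspects what the pattern of cells means, so nothing in it could exempt any configuration from the rule.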

Is this world starting to sound familiar?

Belief in a fair universe often manifests in more subtle ways than thinking that horrors should be outright prohibited: Would the twentieth century have gone differently, if Klara Pölzl and Alois Hitler had made love one hour earlier, and a different sperm fertilized the egg, on the night that Adolf Hitler was conceived?

For so many lives and so much loss to turn on a single event seems disproportionate. The Divine Plan ought to make more sense than that. You can believe in a Divine Plan without believing in God—Karl Marx surely did. You shouldn’t have millions of lives depending on a casual choice, an hour’s timing, the speed of a microscopic flagellum. It ought not to be allowed. It’s too disproportionate. Therefore, if Adolf Hitler had been able to go to high school and become an architect, there would have been someone else to take his role, and World War II would have happened the same as before.

But in the world beyond the reach of God, there isn’t any clause in the physical axioms that says “things have to make sense” or “big effects need big causes” or “history runs on reasons too important to be so fragile.” There is no God to impose that order, which is so severely violated by having the lives and deaths of millions depend on one small molecular event.

The point of the thought experiment is to lay out the God-universe and the Nature-universe side by side, so that we can recognize what kind of thinking belongs to the God-universe. Many who are atheists still think as if certain things are not allowed. They would lay out arguments for why World War II was inevitable and would have happened in more or less the same way, even if Hitler had become an architect. But in sober historical fact, this is an unreasonable belief; I chose the example of World War II because from my reading, it seems that events were mostly driven by Hitler’s personality, often in defiance of his generals and advisors. There is no particular empirical justification that I happen to have heard of for doubting this. The main reason to doubt would be refusal to accept that the universe could make so little sense—that horrible things could happen so lightly, for no more reason than a roll of the dice.

But why not? What prohibits it?

In the God-universe, God prohibits it. To recognize this is to recognize that we don’t live in that universe. We live in the what-if universe beyond the reach of God, driven by the mathematical laws and nothing else. Whatever physics says will happen, will happen. Absolutely anything, good or bad, will happen. And there is nothing in the laws of physics to lift this rule even for the really extreme cases, where you might expect Nature to be a little more reasonable.

Reading William Shirer’s The Rise and Fall of the Third Reich, listening to him describe the disbelief that he and others felt upon discovering the full scope of Nazi atrocities, I thought of what a strange thing it was, to read all that, and know, already, that there wasn’t a single protection against it. To just read through the whole book and accept it; horrified, but not at all disbelieving, because I’d already understood what kind of world I lived in.

Once upon a time, I believed that the extinction of humanity was not allowed. And others who call themselves rationalists may yet have things they trust. They might be called “positive-sum games,” or “democracy,” or “technology,” but they are sacred. The mark of this sacredness is that the trustworthy thing can’t lead to anything really bad; or they can’t be permanently defaced, at least not without a compensatory silver lining. In that sense they can be trusted, even if a few bad things happen here and there.

The unfolding history of Earth can’t ever turn from its positive-sum trend to a negative-sum trend; that is not allowed. Democracies—modern liberal democracies, anyway—won’t ever legalize torture. Technology has done so much good up until now, that there can’t possibly be a Black Swan technology that breaks the trend and does more harm than all the good up until this point.

There are all sorts of clever arguments why such things can’t possibly happen. But the source of these arguments is a much deeper belief that such things are not allowed. Yet who prohibits? Who prevents it from happening? If you can’t visualize at least one lawful universe where physics says that such dreadful things happen—and so they do happen, there being nowhere to appeal the verdict—then you aren’t yet ready to argue probabilities.

Could it really be that sentient beings have died absolutely for thousands or millions of years, with no soul and no afterlife—and not as part of any grand plan of Nature—not to teach any great lesson about the meaningfulness or meaninglessness of life—not even to teach any profound lesson about what is impossible—so that a trick as simple and stupid-sounding as vitrifying people in liquid nitrogen can save them from total annihilation—and a 10-second rejection of the silly idea can destroy someone’s soul? Can it be that a computer programmer who signs a few papers and buys a life-insurance policy continues into the far future, while Einstein rots in a grave? We can be sure of one thing: God wouldn’t allow it. Anything that ridiculous and disproportionate would be ruled out. It would make a mockery of the Divine Plan—a mockery of the strong reasons why things must be the way they are.

You can have secular rationalizations for things being not allowed. So it helps to imagine that there is a God, benevolent as you understand goodness—a God who enforces throughout Reality a minimum of fairness and justice—whose plans make sense and depend proportionally on people’s choices—who will never permit absolute horror—who does not always intervene, but who at least prohibits universes wrenched completely off their track . . . to imagine all this, but also imagine that you, yourself, live in a what-if world of pure mathematics—a world beyond the reach of God, an utterly unprotected world where anything at all can happen.

If there’s any reader still reading this who thinks that being happy counts for more than anything in life, then maybe they shouldn’t spend much time pondering the unprotectedness of their existence. Maybe think of it just long enough to sign up themselves and their family for cryonics, and/or write a check to an existential-risk-mitigation agency now and then. And wear a seat belt and get health insurance and all those other dreary necessary things that can destroy your life if you miss that one step . . . but aside from that, if you want to be happy, meditating on the fragility of life isn’t going to help.
