Rationality: From AI to Zombies

by Eliezer Yudkowsky

I think it means that you have said the word “democracy,” so the audience is supposed to cheer. It’s not so much a propositional statement, as the equivalent of the “Applause” light that tells a studio audience when to clap.

This case is remarkable only in that I mistook the applause light for a policy suggestion, with subsequent embarrassment for all. Most applause lights are much more blatant, and can be detected by a simple reversal test. For example, suppose someone says:

We need to balance the risks and opportunities of AI.

If you reverse this statement, you get:

We shouldn’t balance the risks and opportunities of AI.

Since the reversal sounds abnormal, the unreversed statement is probably normal, implying it does not convey new information. There are plenty of legitimate reasons for uttering a sentence that would be uninformative in isolation. “We need to balance the risks and opportunities of AI” can introduce a discussion topic; it can emphasize the importance of a specific proposal for balancing; it can criticize an unbalanced proposal. Linking to a normal assertion can convey new information to a bounded rationalist—the link itself may not be obvious. But if no specifics follow, the sentence is probably an applause light.

I am tempted to give a talk sometime that consists of nothing but applause lights, and see how long it takes for the audience to start laughing:

I am here to propose to you today that we need to balance the risks and opportunities of advanced Artificial Intelligence. We should avoid the risks and, insofar as it is possible, realize the opportunities. We should not needlessly confront entirely unnecessary dangers. To achieve these goals, we must plan wisely and rationally. We should not act in fear and panic, or give in to technophobia; but neither should we act in blind enthusiasm. We should respect the interests of all parties with a stake in the Singularity. We must try to ensure that the benefits of advanced technologies accrue to as many individuals as possible, rather than being restricted to a few. We must try to avoid, as much as possible, violent conflicts using these technologies; and we must prevent massive destructive capability from falling into the hands of individuals. We should think through these issues before, not after, it is too late to do anything about them . . .

*

Part C

Noticing Confusion

20

Focus Your Uncertainty

Will bond yields go up, or down, or remain the same? If you’re a TV pundit and your job is to explain the outcome after the fact, then there’s no reason to worry. No matter which of the three possibilities comes true, you’ll be able to explain why the outcome perfectly fits your pet market theory. There’s no reason to think of these three possibilities as somehow opposed to one another, as exclusive, because you’ll get full marks for punditry no matter which outcome occurs.

But wait! Suppose you’re a novice TV pundit, and you aren’t experienced enough to make up plausible explanations on the spot. You need to prepare remarks in advance for tomorrow’s broadcast, and you have limited time to prepare. In this case, it would be helpful to know which outcome will actually occur—whether bond yields will go up, down, or remain the same—because then you would only need to prepare one set of excuses.

Alas, no one can possibly foresee the future. What are you to do? You certainly can’t use “probabilities.” We all know from school that “probabilities” are little numbers that appear next to a word problem, and there aren’t any little numbers here. Worse, you feel uncertain. You don’t remember feeling uncertain while you were manipulating the little numbers in word problems. College classes teaching math are nice clean places, therefore math itself can’t apply to life situations that aren’t nice and clean. You wouldn’t want to inappropriately transfer thinking skills from one context to another. Clearly, this is not a matter for “probabilities.”

Nonetheless, you only have 100 minutes to prepare your excuses. You can’t spend the entire 100 minutes on “up,” and also spend all 100 minutes on “down,” and also spend all 100 minutes on “same.” You’ve got to prioritize somehow.

If you needed to justify your time expenditure to a review committee, you would have to spend equal time on each possibility. Since there are no little numbers written down, you’d have no documentation to justify spending different amounts of time. You can hear the reviewers now: And why, Mr. Finkledinger, did you spend exactly 42 minutes on excuse #3? Why not 41 minutes, or 43? Admit it—you’re not being objective! You’re playing subjective favorites!

But, you realize with a small flash of relief, there’s no review committee to scold you. This is good, because there’s a major Federal Reserve announcement tomorrow, and it seems unlikely that bond prices will remain the same. You don’t want to spend 33 precious minutes on an excuse you don’t anticipate needing.

Your mind keeps drifting to the explanations you use on television, of why each event plausibly fits your market theory. But it rapidly becomes clear that plausibility can’t help you here—all three events are plausible. Fittability to your pet market theory doesn’t tell you how to divide your time. There’s an uncrossable gap between your 100 minutes of time, which are conserved; versus your ability to explain how an outcome fits your theory, which is unlimited.

And yet . . . even in your uncertain state of mind, it seems that you anticipate the three events differently; that you expect to need some excuses more than others. And—this is the fascinating part—when you think of something that makes it seem more likely that bond prices will go up, then you feel less likely to need an excuse for bond prices going down or remaining the same.

It even seems like there’s a relation between how much you anticipate each of the three outcomes, and how much time you want to spend preparing each excuse. Of course the relation can’t actually be quantified. You have 100 minutes to prepare your speech, but there isn’t 100 of anything to divide up in this anticipation business. (Although you do work out that, if some particular outcome occurs, then your utility function is logarithmic in time spent preparing the excuse.)

Still . . . your mind keeps coming back to the idea that anticipation is limited, unlike excusability, but like time to prepare excuses. Maybe anticipation should be treated as a conserved resource, like money. Your first impulse is to try to get more anticipation, but you soon realize that, even if you get more anticipation, you won’t have any more time to prepare your excuses. No, your only course is to allocate your limited supply of anticipation as best you can.

You’re pretty sure you weren’t taught anything like that in your statistics courses. They didn’t tell you what to do when you felt so terribly uncertain. They didn’t tell you what to do when there were no little numbers handed to you. Why, even if you tried to use numbers, you might end up using any sort of numbers at all—there’s no hint what kind of math to use, if you should be using math! Maybe you’d end up using pairs of numbers, right and left numbers, which you’d call DS for Dexter-Sinister . . . or who knows what else? (Though you do have only 100 minutes to spend preparing excuses.)

If only there were an art of focusing your uncertainty—of squeezing as much anticipation as possible into whichever outcome will actually happen!

But what could we call an art like that? And what would the rules be like?
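
The parenthetical about logarithmic utility can be made concrete. If anticipation is written as probabilities that sum to 1, and the payoff for whichever outcome occurs is the logarithm of the minutes spent on its excuse, then the best split of the 100 minutes is proportional to the probabilities. What follows is only a sketch under those assumptions: the function names and the anticipations 0.5, 0.4, and 0.1 are hypothetical and appear nowhere in the essay.

```python
import math

def expected_log_utility(probs, minutes):
    """Expected utility when the payoff is log(minutes spent on the outcome that occurs)."""
    return sum(p * math.log(t) for p, t in zip(probs, minutes) if p > 0)

def proportional_allocation(probs, total_minutes=100):
    """Split the time budget in proportion to the anticipated probabilities."""
    return [total_minutes * p for p in probs]

# Hypothetical anticipations for (up, down, same); assumed for illustration only.
probs = [0.5, 0.4, 0.1]

proportional = proportional_allocation(probs)
equal_split = [100 / 3] * 3      # what the imagined review committee would demand
lopsided = [60, 30, 10]          # some other way of playing favorites

for name, plan in [("proportional", proportional),
                   ("equal", equal_split),
                   ("lopsided", lopsided)]:
    print(name, round(expected_log_utility(probs, plan), 4))
```

Under these assumptions the proportional plan scores highest of the three. The minutes are conserved, the anticipation is conserved, and the two line up one for one.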

*

21

What Is Evidence?

The sentence “snow is white” is true if and only if snow is white.

—Alfred Tarski

To say of what is, that it is, or of what is not, that it is not, is true.

—Aristotle, Metaphysics IV

If these two quotes don’t seem like a sufficient definition of “truth,” skip ahead to The Simple Truth. Here I’m going to talk about “evidence.” (I also intend to discuss beliefs-of-fact, not emotions or morality, as distinguished in Feeling Rational.)

Walking along the street, your shoelaces come untied. Shortly thereafter, for some odd reason, you start believing your shoelaces are untied.

Light leaves the Sun and strikes your shoelaces and bounces off; some photons enter the pupils of your eyes and strike your retina; the energy of the photons triggers neural impulses; the neural impulses are transmitted to the visual-processing areas of the brain; and there the optical information is processed and reconstructed into a 3D model that is recognized as an untied shoelace. There is a sequence of events, a chain of cause and effect, within the world and your brain, by which you end up believing what you believe. The final outcome of the process is a state of mind which mirrors the state of your actual shoelaces.

What is evidence? It is an event entangled, by links of cause and effect, with whatever you want to know about. If the target of your inquiry is your shoelaces, for example, then the light entering your pupils is evidence entangled with your shoelaces. This should not be confused with the technical sense of “entanglement” used in physics—here I’m just talking about “entanglement” in the sense of two things that end up in correlated states because of the links of cause and effect between them.

Not every influence creates the kind of “entanglement” required for evidence. It’s no help to have a machine that beeps when you enter winning lottery numbers, if the machine also beeps when you enter losing lottery numbers. The light reflected from your shoes would not be useful evidence about your shoelaces, if the photons ended up in the same physical state whether your shoelaces were tied or untied.

To say it abstractly: For an event to be evidence about a target of inquiry, it has to happen differently in a way that’s entangled with the different possible states of the target. (To say it technically: There has to be Shannon mutual information between the evidential event and the target of inquiry, relative to your current state of uncertainty about both of them.)
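
For readers who want the technical version spelled out, here is a minimal sketch of Shannon mutual information computed from a joint distribution over (target state, signal). The 50/50 prior, the signal names, and the function name are assumptions made for illustration. A signal that tracks the target carries information; a signal that comes out the same either way, like the machine that always beeps, carries none.

```python
import math

def mutual_information(joint):
    """Shannon mutual information, in bits, of a joint distribution.

    `joint` maps (target_state, signal) pairs to probabilities that sum to 1.
    """
    p_target = {}
    p_signal = {}
    for (target, signal), p in joint.items():
        p_target[target] = p_target.get(target, 0.0) + p
        p_signal[signal] = p_signal.get(signal, 0.0) + p
    return sum(
        p * math.log2(p / (p_target[t] * p_signal[s]))
        for (t, s), p in joint.items()
        if p > 0
    )

# Hypothetical 50/50 uncertainty about the shoelaces.
# A working eye ends up in a different state depending on the laces.
working_eye = {("tied", "see_tied"): 0.5, ("untied", "see_untied"): 0.5}

# A detector that responds identically whether the laces are tied or untied.
always_beeps = {("tied", "beep"): 0.5, ("untied", "beep"): 0.5}

print(mutual_information(working_eye))
print(mutual_information(always_beeps))
```

With these made-up numbers the working eye carries one full bit about the shoelaces and the constant beep carries zero, which is the sense in which it is "no help."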

Entanglement can be contagious when processed correctly, which is why you need eyes and a brain. If photons reflect off your shoelaces and hit a rock, the rock won’t change much. The rock won’t reflect the shoelaces in any helpful way; it won’t be detectably different depending on whether your shoelaces were tied or untied. This is why rocks are not useful witnesses in court. A photographic film will contract shoelace-entanglement from the incoming photons, so that the photo can itself act as evidence. If your eyes and brain work correctly, you will become tangled up with your own shoelaces.

This is why rationalists put such a heavy premium on the paradoxical-seeming claim that a belief is only really worthwhile if you could, in principle, be persuaded to believe otherwise. If your retina ended up in the same state regardless of what light entered it, you would be blind. Some belief systems, in a rather obvious trick to reinforce themselves, say that certain beliefs are only really worthwhile if you believe them unconditionally—no matter what you see, no matter what you think. Your brain is supposed to end up in the same state regardless. Hence the phrase, “blind faith.” If what you believe doesn’t depend on what you see, you’ve been blinded as effectively as by poking out your eyeballs.

If your eyes and brain work correctly, your beliefs will end up entangled with the facts. Rational thought produces beliefs which are themselves evidence.

If your tongue speaks truly, your rational beliefs, which are themselves evidence, can act as evidence for someone else. Entanglement can be transmitted through chains of cause and effect—and if you speak, and another hears, that too is cause and effect. When you say “My shoelaces are untied” over a cellphone, you’re sharing your entanglement with your shoelaces with a friend.

Therefore rational beliefs are contagious, among honest folk who believe each other to be honest. And it’s why a claim that your beliefs are not contagious—that you believe for private reasons which are not transmissible—is so suspicious. If your beliefs are entangled with reality, they should be contagious among honest folk.

If your model of reality suggests that the outputs of your thought processes should not be contagious to others, then your model says that your beliefs are not themselves evidence, meaning they are not entangled with reality. You should apply a reflective correction, and stop believing.

Indeed, if you feel, on a gut level, what this all means, you will automatically stop believing. Because “my belief is not entangled with reality” means “my belief is not accurate.” As soon as you stop believing “‘snow is white’ is true,” you should (automatically!) stop believing “snow is white,” or something is very wrong.

So go ahead and explain why the kind of thought processes you use systematically produce beliefs that mirror reality. Explain why you think you’re rational. Why you think that, using thought processes like the ones you use, minds will end up believing “snow is white” if and only if snow is white. If you don’t believe that the outputs of your thought processes are entangled with reality, why do you believe the outputs of your thought processes? It’s the same thing, or it should be.

*

22

Scientific Evidence, Legal Evidence, Rational Evidence

Suppose that your good friend, the police commissioner, tells you in strictest confidence that the crime kingpin of your city is Wulky Wilkinsen. As a rationalist, are you licensed to believe this statement? Put it this way: if you go ahead and insult Wulky, I’d call you foolhardy. Since it is prudent to act as if Wulky has a substantially higher-than-default probability of being a crime boss, the police commissioner’s statement must have been strong Bayesian evidence.

Our legal system will not imprison Wulky on the basis of the police commissioner’s statement. It is not admissible as legal evidence. Maybe if you locked up every person accused of being a crime boss by a police commissioner, you’d initially catch a lot of crime bosses, plus some people that a police commissioner didn’t like. Power tends to corrupt: over time, you’d catch fewer and fewer real crime bosses (who would go to greater lengths to ensure anonymity) and more and more innocent victims (unrestrained power attracts corruption like honey attracts flies).

This does not mean that the police commissioner’s statement is not rational evidence. It still has a lopsided likelihood ratio, and you’d still be a fool to insult Wulky. But on a social level, in pursuit of a social goal, we deliberately define “legal evidence” to include only particular kinds of evidence, such as the police commissioner’s own observations on the night of April 4th. All legal evidence should ideally be rational evidence, but not the other way around. We impose special, strong, additional standards before we anoint rational evidence as “legal evidence.”
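
To see what a lopsided likelihood ratio does, here is a sketch of a Bayesian update in odds form. The numbers are invented for illustration: a prior of 1 in 10,000 that Wulky is the kingpin, and a commissioner who is 1,000 times more likely to name him if he is guilty than if he is not. Neither figure comes from the essay, and the function name is hypothetical.

```python
def posterior_probability(prior, likelihood_ratio):
    """Update a prior probability using the odds form of Bayes's theorem."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Assumed for illustration only; the essay gives no numbers.
prior = 1e-4              # chance Wulky is the kingpin before hearing anything
likelihood_ratio = 1000   # P(statement | kingpin) / P(statement | not kingpin)

posterior = posterior_probability(prior, likelihood_ratio)
print(round(posterior, 4))
```

On these assumptions the probability rises from 0.01% to roughly 9%. That is nowhere near enough to convict, but it is plenty of reason not to insult him.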

As I write this sentence at 8:33 p.m., Pacific time, on August 18th, 2007, I am wearing white socks. As a rationalist, are you licensed to believe the previous statement? Yes. Could I testify to it in court? Yes. Is it a scientific statement? No, because there is no experiment you can perform yourself to verify it. Science is made up of generalizations which apply to many particular instances, so that you can run new real-world experiments which test the generalization, and thereby verify for yourself that the generalization is true, without having to trust anyone’s authority. Science is the publicly reproducible knowledge of humankind.

Like a court system, science as a social process is made up of fallible humans. We want a protected pool of beliefs that are especially reliable. And we want social rules that encourage the generation of such knowledge. So we impose special, strong, additional standards before we canonize rational knowledge as “scientific knowledge,” adding it to the protected belief pool. Is a rationalist licensed to believe in the historical existence of Alexander the Great? Yes. We have a rough picture of ancient Greece, untrustworthy but better than maximum entropy. But we are dependent on authorities such as Plutarch; we cannot discard Plutarch and verify everything for ourselves. Historical knowledge is not scientific knowledge.

Is a rationalist licensed to believe that the Sun will rise on September 18th, 2007? Yes—not with absolute certainty, but that’s the way to bet. (Pedants: interpret this as the Earth’s rotation and orbit remaining roughly constant relative to the Sun.) Is this statement, as I write this essay on August 18th, 2007, a scientific belief?

It may seem perverse to deny the adjective “scientific” to statements like “The Sun will rise on September 18th, 2007.” If Science could not make predictions about future events—events which have not yet happened—then it would be useless; it could make no prediction in advance of experiment. The prediction that the Sun will rise is, definitely, an extrapolation from scientific generalizations. It is based upon models of the Solar System that you could test for yourself by experiment.

But imagine that you’re constructing an experiment to verify prediction #27, in a new context, of an accepted theory Q. You may not have any concrete reason to suspect the belief is wrong; you just want to test it in a new context. It seems dangerous to say, before running the experiment, that there is a “scientific belief” about the result. There is a “conventional prediction” or “theory Q’s prediction.” But if you already know the “scientific belief” about the result, why bother to run the experiment?

You begin to see, I hope, why I identify Science with generalizations, rather than the history of any one experiment. A historical event happens once; generalizations apply over many events. History is not reproducible; scientific generalizations are.

Is my definition of “scientific knowledge” true? That is not a well-formed question. The special standards we impose upon science are pragmatic choices. Nowhere upon the stars or the mountains is it written that p < 0.05 shall be the standard for scientific publication. Many now argue that 0.05 is too weak, and that it would be useful to lower it to 0.01 or 0.001.
