Book Read Free

Rationality- From AI to Zombies

Page 28

by Eliezer Yudkowsky


  You can think of people who fit this description, right? People with high g-factor who end up being less effective because they are too sophisticated as arguers? Do you think you’d be helping them—making them more effective rationalists—if you just told them about a list of classic biases?

  I recall someone who learned about the calibration/overconfidence problem. Soon after he said: “Well, you can’t trust experts; they’re wrong so often—as experiments have shown. So therefore, when I predict the future, I prefer to assume that things will continue historically as they have—” and went off into this whole complex, error-prone, highly questionable extrapolation. Somehow, when it came to trusting his own preferred conclusions, all those biases and fallacies seemed much less salient—leapt much less readily to mind—than when he needed to counter-argue someone else.

  I told the one about the problem of disconfirmation bias and sophisticated argument, and lo and behold, the next time I said something he didn’t like, he accused me of being a sophisticated arguer. He didn’t try to point out any particular sophisticated argument, any particular flaw—just shook his head and sighed sadly over how I was apparently using my own intelligence to defeat itself. He had acquired yet another Fully General Counterargument.

  Even the notion of a “sophisticated arguer” can be deadly, if it leaps all too readily to mind when you encounter a seemingly intelligent person who says something you don’t like.

  I endeavor to learn from my mistakes. The last time I gave a talk on heuristics and biases, I started out by introducing the general concept by way of the conjunction fallacy and representativeness heuristic. And then I moved on to confirmation bias, disconfirmation bias, sophisticated argument, motivated skepticism, and other attitude effects. I spent the next thirty minutes hammering on that theme, reintroducing it from as many different perspectives as I could.

  I wanted to get my audience interested in the subject. Well, a simple description of conjunction fallacy and representativeness would suffice for that. But suppose they did get interested. Then what? The literature on bias is mostly cognitive psychology for cognitive psychology’s sake. I had to give my audience their dire warnings during that one lecture, or they probably wouldn’t hear them at all.

  Whether I do it on paper, or in speech, I now try to never mention calibration and overconfidence unless I have first talked about disconfirmation bias, motivated skepticism, sophisticated arguers, and dysrationalia in the mentally agile. First, do no harm!

  *

  1. Charles S. Taber and Milton Lodge, “Motivated Skepticism in the Evaluation of Political Beliefs,” American Journal of Political Science 50, no. 3 (2006): 755–769, doi:10.1111/j.1540-5907.2006.00214.x.

  68

  Update Yourself Incrementally

  Politics is the mind-killer. Debate is war, arguments are soldiers. There is the temptation to search for ways to interpret every possible experimental result to confirm your theory, like securing a citadel against every possible line of attack. This you cannot do. It is mathematically impossible. For every expectation of evidence, there is an equal and opposite expectation of counterevidence.

  But it’s okay if your cherished belief isn’t perfectly defended. If the hypothesis is that the coin comes up heads 95% of the time, then one time in twenty you will expect to see what looks like contrary evidence. This is okay. It’s normal. It’s even expected, so long as you’ve got nineteen supporting observations for every contrary one. A probabilistic model can take a hit or two, and still survive, so long as the hits don’t keep on coming in.

  Yet it is widely believed, especially in the court of public opinion, that a true theory can have no failures and a false theory no successes.

  You find people holding up a single piece of what they conceive to be evidence, and claiming that their theory can “explain” it, as though this were all the support that any theory needed. Apparently a false theory can have no supporting evidence; it is impossible for a false theory to fit even a single event. Thus, a single piece of confirming evidence is all that any theory needs.

  It is only slightly less foolish to hold up a single piece of probabilistic counterevidence as disproof, as though it were impossible for a correct theory to have even a slight argument against it. But this is how humans have argued for ages and ages, trying to defeat all enemy arguments, while denying the enemy even a single shred of support. People want their debates to be one-sided; they are accustomed to a world in which their preferred theories have not one iota of antisupport. Thus, allowing a single item of probabilistic counterevidence would be the end of the world.

  I just know someone in the audience out there is going to say, “But you can’t concede even a single point if you want to win debates in the real world! If you concede that any counterarguments exist, the Enemy will harp on them over and over—you can’t let the Enemy do that! You’ll lose! What could be more viscerally terrifying than that?”

  Whatever. Rationality is not for winning debates, it is for deciding which side to join. If you’ve already decided which side to argue for, the work of rationality is done within you, whether well or poorly. But how can you, yourself, decide which side to argue? If choosing the wrong side is viscerally terrifying, even just a little viscerally terrifying, you’d best integrate all the evidence.

  Rationality is not a walk, but a dance. On each step in that dance your foot should come down in exactly the correct spot, neither to the left nor to the right. Shifting belief upward with each iota of confirming evidence. Shifting belief downward with each iota of contrary evidence. Yes, down. Even with a correct model, if it is not an exact model, you will sometimes need to revise your belief down.

  If an iota or two of evidence happens to countersupport your belief, that’s okay. It happens, sometimes, with probabilistic evidence for non-exact theories. (If an exact theory fails, you are in trouble!) Just shift your belief downward a little—the probability, the odds ratio, or even a nonverbal weight of credence in your mind. Just shift downward a little, and wait for more evidence. If the theory is true, supporting evidence will come in shortly, and the probability will climb again. If the theory is false, you don’t really want it anyway.

  The problem with using black-and-white, binary, qualitative reasoning is that any single observation either destroys the theory or it does not. When not even a single contrary observation is allowed, it creates cognitive dissonance and has to be argued away. And this rules out incremental progress; it rules out correct integration of all the evidence. Reasoning probabilistically, we realize that on average, a correct theory will generate a greater weight of support than countersupport. And so you can, without fear, say to yourself: “This is gently contrary evidence, I will shift my belief downward.” Yes, down. It does not destroy your cherished theory. That is qualitative reasoning; think quantitatively.

  For every expectation of evidence, there is an equal and opposite expectation of counterevidence. On every occasion, you must, on average, anticipate revising your beliefs downward as much as you anticipate revising them upward. If you think you already know what evidence will come in, then you must already be fairly sure of your theory—probability close to 1—which doesn’t leave much room for the probability to go further upward. And however unlikely it seems that you will encounter disconfirming evidence, the resulting downward shift must be large enough to precisely balance the anticipated gain on the other side. The weighted mean of your expected posterior probability must equal your prior probability.

  How silly is it, then, to be terrified of revising your probability downward, if you’re bothering to investigate a matter at all? On average, you must anticipate as much downward shift as upward shift from every individual observation.

  It may perhaps happen that an iota of antisupport comes in again, and again and again, while new support is slow to trickle in. You may find your belief drifting downward and further downward. Until, finally, you realize from which quarter the winds of evidence are
blowing against you. In that moment of realization, there is no point in constructing excuses. In that moment of realization, you have already relinquished your cherished belief. Yay! Time to celebrate! Pop a champagne bottle or send out for pizza! You can’t become stronger by keeping the beliefs you started with, after all.

  *

  69

  One Argument Against An Army

  I talked about a style of reasoning in which not a single contrary argument is allowed, with the result that every non-supporting observation has to be argued away. Here I suggest that when people encounter a contrary argument, they prevent themselves from downshifting their confidence by rehearsing already-known support.

  Suppose the country of Freedonia is debating whether its neighbor, Sylvania, is responsible for a recent rash of meteor strikes on its cities. There are several pieces of evidence suggesting this: the meteors struck cities close to the Sylvanian border; there was unusual activity in the Sylvanian stock markets before the strikes; and the Sylvanian ambassador Trentino was heard muttering about “heavenly vengeance.”

  Someone comes to you and says: “I don’t think Sylvania is responsible for the meteor strikes. They have trade with us of billions of dinars annually.” “Well,” you reply, “the meteors struck cities close to Sylvania, there was suspicious activity in their stock market, and their ambassador spoke of heavenly vengeance afterward.” Since these three arguments outweigh the first, you keep your belief that Sylvania is responsible—you believe rather than disbelieve, qualitatively. Clearly, the balance of evidence weighs against Sylvania.

  Then another comes to you and says: “I don’t think Sylvania is responsible for the meteor strikes. Directing an asteroid strike is really hard. Sylvania doesn’t even have a space program.” You reply, “But the meteors struck cities close to Sylvania, and their investors knew it, and the ambassador came right out and admitted it!” Again, these three arguments outweigh the first (by three arguments against one argument), so you keep your belief that Sylvania is responsible.

  Indeed, your convictions are strengthened. On two separate occasions now, you have evaluated the balance of evidence, and both times the balance was tilted against Sylvania by a ratio of 3 to 1.

  You encounter further arguments by the pro-Sylvania traitors—again, and again, and a hundred times again—but each time the new argument is handily defeated by 3 to 1. And on every occasion, you feel yourself becoming more confident that Sylvania was indeed responsible, shifting your prior according to the felt balance of evidence.

  The problem, of course, is that by rehearsing arguments you already knew, you are double-counting the evidence. This would be a grave sin even if you double-counted all the evidence. (Imagine a scientist who does an experiment with 50 subjects and fails to obtain statistically significant results, so the scientist counts all the data twice.)

  But to selectively double-count only some evidence is sheer farce. I remember seeing a cartoon as a child, where a villain was dividing up loot using the following algorithm: “One for you, one for me. One for you, one-two for me. One for you, one-two-three for me.”

  As I emphasized in the last essay, even if a cherished belief is true, a rationalist may sometimes need to downshift the probability while integrating all the evidence. Yes, the balance of support may still favor your cherished belief. But you still have to shift the probability down—yes, down—from whatever it was before you heard the contrary evidence. It does no good to rehearse supporting arguments, because you have already taken those into account.

  And yet it does appear to me that when people are confronted by a new counterargument, they search for a justification not to downshift their confidence, and of course they find supporting arguments they already know. I have to keep constant vigilance not to do this myself! It feels as natural as parrying a sword-strike with a handy shield.

  With the right kind of wrong reasoning, a handful of support—or even a single argument—can stand off an army of contradictions.

  *

  70

  The Bottom Line

  There are two sealed boxes up for auction, box A and box B. One and only one of these boxes contains a valuable diamond. There are all manner of signs and portents indicating whether a box contains a diamond; but I have no sign which I know to be perfectly reliable. There is a blue stamp on one box, for example, and I know that boxes which contain diamonds are more likely than empty boxes to show a blue stamp. Or one box has a shiny surface, and I have a suspicion—I am not sure—that no diamond-containing box is ever shiny.

  Now suppose there is a clever arguer, holding a sheet of paper, and they say to the owners of box A and box B: “Bid for my services, and whoever wins my services, I shall argue that their box contains the diamond, so that the box will receive a higher price.” So the box-owners bid, and box B’s owner bids higher, winning the services of the clever arguer.

  The clever arguer begins to organize their thoughts. First, they write, “And therefore, box B contains the diamond!” at the bottom of their sheet of paper. Then, at the top of the paper, the clever arguer writes, “Box B shows a blue stamp,” and beneath it, “Box A is shiny,” and then, “Box B is lighter than box A,” and so on through many signs and portents; yet the clever arguer neglects all those signs which might argue in favor of box A. And then the clever arguer comes to me and recites from their sheet of paper: “Box B shows a blue stamp, and box A is shiny,” and so on, until they reach: “and therefore, box B contains the diamond.”

  But consider: At the moment when the clever arguer wrote down their conclusion, at the moment they put ink on their sheet of paper, the evidential entanglement of that physical ink with the physical boxes became fixed.

  It may help to visualize a collection of worlds—Everett branches or Tegmark duplicates—within which there is some objective frequency at which box A or box B contains a diamond. There’s likewise some objective frequency within the subset “worlds with a shiny box A” where box B contains the diamond; and some objective frequency in “worlds with shiny box A and blue-stamped box B” where box B contains the diamond.

  The ink on paper is formed into odd shapes and curves, which look like this text: “And therefore, box B contains the diamond.” If you happened to be a literate English speaker, you might become confused, and think that this shaped ink somehow meant that box B contained the diamond. Subjects instructed to say the color of printed pictures and shown the picture GREEN often say “green” instead of “red.” It helps to be illiterate, so that you are not confused by the shape of the ink.

  To us, the true import of a thing is its entanglement with other things. Consider again the collection of worlds, Everett branches or Tegmark duplicates. At the moment when all clever arguers in all worlds put ink to the bottom line of their paper—let us suppose this is a single moment—it fixed the correlation of the ink with the boxes. The clever arguer writes in non-erasable pen; the ink will not change. The boxes will not change. Within the subset of worlds where the ink says “And therefore, box B contains the diamond,” there is already some fixed percentage of worlds where box A contains the diamond. This will not change regardless of what is written in on the blank lines above.

  So the evidential entanglement of the ink is fixed, and I leave to you to decide what it might be. Perhaps box owners who believe a better case can be made for them are more liable to hire advertisers; perhaps box owners who fear their own deficiencies bid higher. If the box owners do not themselves understand the signs and portents, then the ink will be completely unentangled with the boxes’ contents, though it may tell you something about the owners’ finances and bidding habits.

  Now suppose another person present is genuinely curious, and they first write down all the distinguishing signs of both boxes on a sheet of paper, and then apply their knowledge and the laws of probability and write down at the bottom: “Therefore, I estimate an 85% probability that box B contains the diamond.” Of what is this handwriting evidence? E
xamining the chain of cause and effect leading to this physical ink on physical paper, I find that the chain of causality wends its way through all the signs and portents of the boxes, and is dependent on these signs; for in worlds with different portents, a different probability is written at the bottom.

  So the handwriting of the curious inquirer is entangled with the signs and portents and the contents of the boxes, whereas the handwriting of the clever arguer is evidence only of which owner paid the higher bid. There is a great difference in the indications of ink, though one who foolishly read aloud the ink-shapes might think the English words sounded similar.

  Your effectiveness as a rationalist is determined by whichever algorithm actually writes the bottom line of your thoughts. If your car makes metallic squealing noises when you brake, and you aren’t willing to face up to the financial cost of getting your brakes replaced, you can decide to look for reasons why your car might not need fixing. But the actual percentage of you that survive in Everett branches or Tegmark worlds—which we will take to describe your effectiveness as a rationalist—is determined by the algorithm that decided which conclusion you would seek arguments for. In this case, the real algorithm is “Never repair anything expensive.” If this is a good algorithm, fine; if this is a bad algorithm, oh well. The arguments you write afterward, above the bottom line, will not change anything either way.

  This is intended as a caution for your own thinking, not a Fully General Counterargument against conclusions you don’t like. For it is indeed a clever argument to say “My opponent is a clever arguer,” if you are paying yourself to retain whatever beliefs you had at the start. The world’s cleverest arguer may point out that the Sun is shining, and yet it is still probably daytime.

 

‹ Prev