
Rationality: From AI to Zombies


by Eliezer Yudkowsky


  Anyone who knows a little prospect theory will have no trouble constructing cases where people say they would prefer to play gamble A rather than gamble B; but when you ask them to price the gambles they put a higher value on gamble B than gamble A. There are different perceptual features that become salient when you ask “Which do you prefer?” in a direct comparison, and “How much would you pay?” with a single item.

  This choice of gambles typically generates a preference reversal:

  1. 1/3 chance to win $16 and 2/3 chance to lose $2.

  2. 99/100 chance to win $4 and 1/100 chance to lose $1.

  Most people would rather play 2 than 1. But if you ask them to price the bets separately—ask for a price at which they would be indifferent between having that amount of money, and having a chance to play the gamble—people will put a higher price on 1 than on 2.1

  So first you sell them a chance to play bet 1, at their stated price. Then you offer to trade bet 1 for bet 2. Then you buy bet 2 back from them, at their stated price. Then you do it again. Hence the phrase, “money pump.”
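
  If it helps to see the arithmetic, here is a small Python sketch of the pump. The stated prices are invented for illustration, since the experiment's actual figures aren't given here; the only thing that matters is that the subject prices bet 1 above bet 2 while preferring to play bet 2.

    def run_money_pump(price_bet_1, price_bet_2, rounds):
        """Money extracted over `rounds` cycles of the pump."""
        extracted = 0.0
        for _ in range(rounds):
            extracted += price_bet_1  # sell them bet 1 at their own stated price
            # they gladly trade bet 1 away for bet 2, which they prefer to play
            extracted -= price_bet_2  # buy bet 2 back at their own (lower) stated price
        return extracted

    # Hypothetical prices, chosen only to make the cycle visible:
    print(run_money_pump(price_bet_1=5.00, price_bet_2=4.00, rounds=10))  # 10.0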

  Or to paraphrase Steve Omohundro: If you would rather be in Oakland than San Francisco, and you would rather be in San Jose than Oakland, and you would rather be in San Francisco than San Jose, you’re going to spend an awful lot of money on taxi rides.

  Amazingly, people defend these preference patterns. Some subjects abandon them after the money-pump effect is pointed out—revise their price or revise their preference—but some subjects defend them.

  On one occasion, gamblers in Las Vegas played these kinds of bets for real money, using a roulette wheel. And afterward, one of the researchers tried to explain the problem with the incoherence between their pricing and their choices. From the transcript:2,3

  SARAH LICHTENSTEIN: “Well, how about the bid for Bet A? Do you have any further feelings about it now that you know you are choosing one but bidding more for the other one?”

  SUBJECT: “It’s kind of strange, but no, I don’t have any feelings at all whatsoever really about it. It’s just one of those things. It shows my reasoning process isn’t so good, but, other than that, I . . . no qualms.”

  . . .

  LICHTENSTEIN: “Can I persuade you that it is an irrational pattern?”

  SUBJECT: “No, I don’t think you probably could, but you could try.”

  . . .

  LICHTENSTEIN: “Well, now let me suggest what has been called a money-pump game and try this out on you and see how you like it. If you think Bet A is worth 550 points [points were converted to dollars after the game, though not on a one-to-one basis] then you ought to be willing to give me 550 points if I give you the bet . . .”

  . . .

  LICHTENSTEIN: “So you have Bet A, and I say, ‘Oh, you’d rather have Bet B wouldn’t you?’”

  . . .

  SUBJECT: “I’m losing money.”

  LICHTENSTEIN: “I’ll buy Bet B from you. I’ll be generous; I’ll pay you more than 400 points. I’ll pay you 401 points. Are you willing to sell me Bet B for 401 points?”

  SUBJECT: “Well, certainly.”

  . . .

  LICHTENSTEIN: “I’m now ahead 149 points.”

  SUBJECT: “That’s good reasoning on my part. (laughs) How many times are we going to go through this?”

  . . .

  LICHTENSTEIN: “Well, I think I’ve pushed you as far as I know how to push you short of actually insulting you.”

  SUBJECT: “That’s right.”

  You want to scream, “Just give up already! Intuition isn’t always right!”

  And then there’s the business of the strange value that people attach to certainty. My books are packed up for the move, but I believe that one experiment showed that a shift from 100% probability to 99% probability weighed larger in people’s minds than a shift from 80% probability to 79% probability.

  The problem with attaching a huge extra value to certainty is that one time’s certainty is another time’s probability.

  In the last essay, I talked about the Allais Paradox:

  1A. $24,000, with certainty.

  1B. 33/34 chance of winning $27,000, and 1/34 chance of winning nothing.

  2A. 34% chance of winning $24,000, and 66% chance of winning nothing.

  2B. 33% chance of winning $27,000, and 67% chance of winning nothing.

  The naive preference pattern on the Allais Paradox is 1A > 1B and 2B > 2A. Then you will pay me to throw a switch from A to B because you’d rather have a 33% chance of winning $27,000 than a 34% chance of winning $24,000. Then a die roll eliminates a chunk of the probability mass. In both cases you had at least a 66% chance of winning nothing. This die roll eliminates that 66%. So now option B is a 33/34 chance of winning $27,000, but option A is a certainty of winning $24,000. Oh, glorious certainty! So you pay me to throw the switch back from B to A.

  Now, if I’ve told you in advance that I’m going to do all that, do you really want to pay me to throw the switch, and then pay me to throw it back? Or would you prefer to reconsider?

  Whenever you try to price a probability shift from 24% to 23% as being less important than a shift from ~1 to 99%—every time you try to make an increment of probability have more value when it’s near an end of the scale—you open yourself up to this kind of exploitation. I can always set up a chain of events that eliminates the probability mass, a bit at a time, until you’re left with “certainty” that flips your preferences. One time’s certainty is another time’s uncertainty, and if you insist on treating the distance from ~1 to 0.99 as special, I can cause you to invert your preferences over time and pump some money out of you.
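
  To see that chain of events concretely, here is a quick check in Python, using exact fractions: options 2A and 2B above, once the 66% chance of winning nothing has been resolved in your favor, are precisely options 1A and 1B.

    from fractions import Fraction

    p_eliminated = Fraction(66, 100)  # the chunk of probability mass the die roll removes
    p_survive = 1 - p_eliminated      # 34/100

    p_win_2A = Fraction(34, 100)      # 2A: 34% chance of $24,000
    p_win_2B = Fraction(33, 100)      # 2B: 33% chance of $27,000

    print(p_win_2A / p_survive)       # 1     -> option 1A: $24,000 with certainty
    print(p_win_2B / p_survive)       # 33/34 -> option 1B: 33/34 chance of $27,000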

  Can I persuade you, perhaps, that this is an irrational pattern?

  Surely, if you’ve been reading this book for a while, you realize that you—the very system and process that reads these very words—are a flawed piece of machinery. Your intuitions are not giving you direct, veridical information about good choices. If you don’t believe that, there are some gambling games I’d like to play with you.

  There are various other games you can also play with certainty effects. For example, if you offer someone a certainty of $400, or an 80% probability of $500 and a 20% probability of $300, they’ll usually take the $400. But if you ask people to imagine themselves $500 richer, and ask if they would prefer a certain loss of $100 or a 20% chance of losing $200, they’ll usually take the chance of losing $200.4 Same probability distribution over outcomes, different descriptions, different choices.
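
  If the equivalence isn’t obvious, here is a minimal check in Python, writing both descriptions as probability distributions over the change in total wealth:

    # Description 1: $400 for sure, versus 80% chance of $500 and 20% chance of $300.
    sure_thing_1 = {400: 1.0}
    gamble_1 = {500: 0.8, 300: 0.2}

    # Description 2: imagine yourself $500 richer, then either lose $100 for sure,
    # or take a 20% chance of losing $200.
    sure_thing_2 = {500 - 100: 1.0}
    gamble_2 = {500 - 0: 0.8, 500 - 200: 0.2}

    print(sure_thing_1 == sure_thing_2)  # True: the same sure gain of $400
    print(gamble_1 == gamble_2)          # True: the same 80/20 gamble over $500 and $300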

  Yes, Virginia, you really should try to multiply the utility of outcomes by their probability. You really should. Don’t be embarrassed to use clean math.

  In the Allais paradox, figure out whether 1 unit of the difference between getting $24,000 and getting nothing outweighs 33 units of the difference between getting $24,000 and $27,000. If it does, prefer 1A to 1B and 2A to 2B. If the 33 units outweigh the 1 unit, prefer 1B to 1A and 2B to 2A. As for calculating the utility of money, I would suggest using an approximation that assumes utility is logarithmic in money. If you’ve got plenty of money already, pick B. If $24,000 would double your existing assets, pick A. Case 2 or case 1, it makes no difference. Oh, and be sure to assess the utility of total asset values—the utility of final outcome states of the world—not changes in assets, or you’ll end up inconsistent again.
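
  For concreteness, here is a sketch of that calculation in Python, assuming logarithmic utility over total final assets. The wealth levels plugged in at the end are arbitrary illustrations, not figures from any experiment.

    from math import log

    OPTIONS = {
        "1A": [(24_000, 1.0)],
        "1B": [(27_000, 33 / 34), (0, 1 / 34)],
        "2A": [(24_000, 0.34), (0, 0.66)],
        "2B": [(27_000, 0.33), (0, 0.67)],
    }

    def expected_utility(current_assets, outcomes):
        # Utility is taken over final total assets, not over changes in assets.
        return sum(p * log(current_assets + prize) for prize, p in outcomes)

    def preferred(current_assets):
        eu = {name: expected_utility(current_assets, o) for name, o in OPTIONS.items()}
        first = "1A" if eu["1A"] > eu["1B"] else "1B"
        second = "2A" if eu["2A"] > eu["2B"] else "2B"
        return first, second

    print(preferred(300))        # nearly broke: ('1A', '2A') -- the sure $24,000 wins
    print(preferred(1_000_000))  # plenty of money already: ('1B', '2B')

  Under this rule the two cases never come apart; whatever your existing wealth, you end up preferring both A options or both B options.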

  A number of commenters claimed that the preference pattern wasn’t irrational because of “the utility of certainty,” or something like that. One commenter even wrote U(Certainty) into an expected utility equation.

  Does anyone remember that whole business about expected utility and utility being of fundamentally different types? Utilities are over outcomes. They are values you attach to particular, solid states of the world. You cannot feed a probability of 1 into a utility function. It makes no sense.

  And before you sniff, “Hmph . . . you just want the math to be neat and tidy,” remember that, in this case, the price of departing the Bayesian Way was paying someone to throw a switch and then throw it back.

  But what about that solid, warm feeling of reassurance? Isn’t that a utility?

  That’s being human. Humans are not expected utility maximizers. Whether you want to relax and have fun, or pay some extra money for a feeling of certainty, depends on whether you care more about satisfying your intuitions or actually achieving the goal.

  If you’re gambling at Las Vegas for fun, then by all means, don’t think about the expected utility—you’re going to lose money anyway.

  But what if it were 24,000 lives at stake, instead of $24,000? The certainty effect is even stronger over human lives. Will you pay one human life to throw the switch, and another to switch it back?

  Tolerating preference reversals makes a mockery of claims to optimization. If you drive from San Jose to San Francisco to Oakland to San Jose, over and over again, then you may get a lot of warm fuzzy feelings out of it, but you can’t be interpreted as having a destination—as trying to go somewhere.

  When you have circular preferences, you’re not steering the future—just running in circles. If you enjoy running for its own sake, then fine. But if you have a goal—something you’re trying to actually accomplish—a preference reversal reveals a big problem. At least one of the choices you’re making must not be working to actually optimize the future in any coherent sense.

  If what you care about is the warm fuzzy feeling of certainty, then fine. If someone’s life is at stake, then you had best realize that your intuitions are a greasy lens through which to see the world. Your feelings are not providing you with direct, veridical information about strategic consequences—it feels that way, but they’re not. Warm fuzzies can lead you far astray.

  There are mathematical laws governing efficient strategies for steering the future. When something truly important is at stake—something more important than your feelings of happiness about the decision—then you should care about the math, if you truly care at all.

  *

  1. Sarah Lichtenstein and Paul Slovic, “Reversals of Preference Between Bids and Choices in Gambling Decisions,” Journal of Experimental Psychology 89, no. 1 (1971): 46–55.

  2. William Poundstone, Priceless: The Myth of Fair Value (and How to Take Advantage of It) (Hill & Wang, 2010).

  3. Sarah Lichtenstein and Paul Slovic, eds., The Construction of Preference (Cambridge University Press, 2006).

  4. Kahneman and Tversky, “Prospect Theory: An Analysis of Decision Under Risk.”

  285

  Feeling Moral

  Suppose that a disease, or a monster, or a war, or something, is killing people. And suppose you only have enough resources to implement one of the following two options:

  1. Save 400 lives, with certainty.

  2. Save 500 lives, with 90% probability; save no lives, 10% probability.

  Most people choose option 1. Which, I think, is foolish; because if you multiply 500 lives by 90% probability, you get an expected value of 450 lives, which exceeds the 400-life value of option 1. (Lives saved don’t diminish in marginal utility, so this is an appropriate calculation.)

  “What!” you cry, incensed. “How can you gamble with human lives? How can you think about numbers when so much is at stake? What if that 10% probability strikes, and everyone dies? So much for your damned logic! You’re following your rationality off a cliff!”

  Ah, but here’s the interesting thing. If you present the options this way:

  1. 100 people die, with certainty.

  2. 90% chance no one dies; 10% chance 500 people die.

  Then a majority choose option 2. Even though it’s the same gamble. You see, just as a certainty of saving 400 lives seems to feel so much more comfortable than an unsure gain, so too, a certain loss feels worse than an uncertain one.
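
  Spelled out in Python, assuming (as the setup implies) that the same 500 lives are at stake in both framings:

    # Each option as a probability distribution over the number of deaths (out of 500).
    save_400_surely = {100: 1.0}              # "save 400 with certainty"
    save_500_gamble = {0: 0.9, 500: 0.1}      # "90% save all 500, 10% save none"

    die_100_surely = {100: 1.0}               # "100 people die, with certainty"
    die_500_gamble = {0: 0.9, 500: 0.1}       # "90% no one dies; 10% 500 die"

    print(save_400_surely == die_100_surely)  # True: the same option, reworded
    print(save_500_gamble == die_500_gamble)  # True

    # Expected lives saved under each option:
    print(500 - 100)                          # 400 for the sure thing
    print(500 - (0.9 * 0 + 0.1 * 500))        # 450.0 for the gamble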

  You can grandstand on the second description too: “How can you condemn 100 people to certain death when there’s such a good chance you can save them? We’ll all share the risk! Even if it was only a 75% chance of saving everyone, it would still be worth it—so long as there’s a chance—everyone makes it, or no one does!”

  You know what? This isn’t about your feelings. A human life, with all its joys and all its pains, adding up over the course of decades, is worth far more than your brain’s feelings of comfort or discomfort with a plan. Does computing the expected utility feel too cold-blooded for your taste? Well, that feeling isn’t even a feather in the scales, when a life is at stake. Just shut up and multiply.

  A googol is 10^100—a 1 followed by one hundred zeroes. A googolplex is an even more incomprehensibly large number—it’s 10^googol, a 1 followed by a googol zeroes. Now pick some trivial inconvenience, like a hiccup, and some decidedly untrivial misfortune, like getting slowly torn limb from limb by sadistic mutant sharks. If we’re forced into a choice between either preventing a googolplex people’s hiccups, or preventing a single person’s shark attack, which choice should we make? If you assign any negative value to hiccups, then, on pain of decision-theoretic incoherence, there must be some number of hiccups that would add up to rival the negative value of a shark attack. For any particular finite evil, there must be some number of hiccups that would be even worse.
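
  The coherence requirement here is nothing more than arithmetic. With made-up disutility numbers (placeholders only; nothing hinges on the particular values), the threshold looks like this in Python:

    from math import ceil

    hiccup_disutility = 1e-9   # hypothetical: a hiccup is barely bad at all
    shark_disutility = 1e9     # hypothetical: the shark attack is enormously bad

    # If a hiccup has any negative value at all, some finite number of hiccups
    # adds up to more than any particular finite evil:
    n_hiccups_needed = ceil(shark_disutility / hiccup_disutility)
    print(n_hiccups_needed)    # 1,000,000,000,000,000,000 -- large, but finite

    # A googolplex (10**10**100) dwarfs any such finite threshold, so an
    # aggregating utility function has to rate the googolplex of hiccups as worse.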

  Moral dilemmas like these aren’t conceptual blood sports for keeping analytic philosophers entertained at dinner parties. They’re distilled versions of the kinds of situations we actually find ourselves in every day. Should I spend $50 on a console game, or give it all to charity? Should I organize a $700,000 fundraiser to pay for a single bone marrow transplant, or should I use that same money on mosquito nets and prevent the malaria deaths of some 200 children?

  Yet there are many who avert their gaze from the real world’s abundance of unpleasant moral tradeoffs—many, too, who take pride in looking away. Research shows that people distinguish “sacred values,” like human lives, from “unsacred values,” like money. When you try to trade off a sacred value against an unsacred value, subjects express great indignation. (Sometimes they want to punish the person who made the suggestion.)

  My favorite anecdote along these lines comes from a team of researchers who evaluated the effectiveness of a certain project, calculating the cost per life saved, and recommended to the government that the project be implemented because it was cost-effective. The governmental agency rejected the report because, they said, you couldn’t put a dollar value on human life. After rejecting the report, the agency decided not to implement the measure.

  Trading off a sacred value against an unsacred value feels really awful. To merely multiply utilities would be too cold-blooded—it would be following rationality off a cliff . . .

  But altruism isn’t the warm fuzzy feeling you get from being altruistic. If you’re doing it for the spiritual benefit, that is nothing but selfishness. The primary thing is to help others, whatever the means. So shut up and multiply!

  And if it seems to you that there is a fierceness to this maximization, like the bare sword of the law, or the burning of the Sun—if it seems to you that at the center of this rationality there is a small cold flame—

  Well, the other way might feel better inside you. But it wouldn’t work.

  And I say also this to you: That if you set aside your regret for all the spiritual satisfaction you could be having—if you wholeheartedly pursue the Way, without thinking that you are being cheated—if you give yourself over to rationality without holding back, you will find that rationality gives to you in return.

  But that part only works if you don’t go around saying to yourself, “It would feel better inside me if only I could be less rational.” Should you be sad that you have the opportunity to actually help people? You cannot attain your full potential if you regard your gift as a burden.

  *

  286

  The “Intuitions” Behind “Utilitarianism”

  I used to be very confused about metaethics. After my confusion finally cleared up, I did a postmortem on my previous thoughts. I found that my object-level moral reasoning had been valuable and my meta-level moral reasoning had been worse than useless. And this appears to be a general syndrome—people do much better when discussing whether torture is good or bad than when they discuss the meaning of “good” and “bad.” Thus, I deem it prudent to keep moral discussions on the object level wherever I possibly can.

  Occasionally people object to any discussion of morality on the grounds that morality doesn’t exist, and in lieu of explaining that “exist” is not the right term to use here, I generally say, “But what do you do anyway?” and take the discussion back down to the object level.

  Paul Gowder, though, has pointed out that both the idea of choosing a googolplex trivial inconveniences over one atrocity, and the idea of “utilitarianism,” depend on “intuition.” He says I’ve argued that the two are not compatible, but charges me with failing to argue for the utilitarian intuitions that I appeal to.

  Now “intuition” is not how I would describe the computations that underlie human morality and distinguish us, as moralists, from an ideal philosopher of perfect emptiness and/or a rock. But I am okay with using the word “intuition” as a term of art, bearing in mind that “intuition” in this sense is not to be contrasted to reason, but is, rather, the cognitive building block out of which both long verbal arguments and fast perceptual arguments are constructed.

  I see the project of morality as a project of renormalizing intuition. We have intuitions about things that seem desirable or undesirable, intuitions about actions that are right or wrong, intuitions about how to resolve conflicting intuitions, intuitions about how to systematize specific intuitions into general principles.

  Delete all the intuitions, and you aren’t left with an ideal philosopher of perfect emptiness; you’re left with a rock.

  Keep all your specific intuitions and refuse to build upon the reflective ones, and you aren’t left with an ideal philosopher of perfect spontaneity and genuineness; you’re left with a grunting caveperson running in circles, due to cyclical preferences and similar inconsistencies.

 
