Chapters 4 and 5 reinforce a morality-tale reading of the evidence, with sharply etched good guys (the spry foxes) and bad guys (the self-assured hedgehogs). Chapter 6 calls on us to hear out the defense before reaching a final verdict. The defense raises logical objections to the factual, moral, and metaphysical assumptions underlying claims that “one group makes more accurate judgments than another” and demands difficulty, value, controversy, and fuzzy-set scoring-rule adjustments as compensation. The defense also raises the psychological objection that there is no single, best cognitive style across situations.41 Overconfidence may be essential for achieving the forecasting coups that posterity hails as visionary. The bold but often wrong forecasts of hedgehogs may be as forgivable as high strikeout rates among home-run hitters, the product of a reasonable trade-off, not grounds for getting kicked off the team. Both sets of defenses create pockets of reasonable doubt but, in the end, neither can exonerate hedgehogs of all their transgressions. Hedgehogs just made too many mistakes spread across too many topics.
Whereas chapter 6 highlighted some benefits of the “closed-minded” hedgehog approach to the world, chapter 7 dwells on some surprising costs of the “open-minded” fox approach. Consultants in the business and political worlds often use scenario exercises to encourage decision makers to let down their guards and imagine a broader array of possibilities than they normally would.42 On the plus side, these exercises can check some forms of overconfidence, no mean achievement. On the minus side, these exercises can stimulate experts—once they start unpacking possible worlds—to assign too much likelihood to too many scenarios.43 There is nothing admirably open-minded about agreeing that the probability of event A is less than the compound probability of A and B, or that x is inevitable but alternatives to x remain possible. Trendy open-mindedness looks like old-fashioned confusion. And the open-minded foxes are more vulnerable to this confusion than the closed-minded hedgehogs.
We are left, then, with a murkier tale. The dominant danger remains hubris, the mostly hedgehog vice of closed-mindedness, of dismissing dissonant possibilities too quickly. But there is also the danger of cognitive chaos, the mostly fox vice of excessive open-mindedness, of seeing too much merit in too many stories. Good judgment now becomes a metacognitive skill—akin to “the art of self-overhearing.”44 Good judges need to eavesdrop on the mental conversations they have with themselves as they decide how to decide, and determine whether they approve of the trade-offs they are striking in the classic exploration-exploitation balancing act between exploiting existing knowledge and exploring new possibilities.
Chapter 8 reflects on the broader implications of this project. From a philosophy of science perspective, there is value in assessing how far an exercise of this sort can be taken. We failed to purge all subjectivity from judgments of good judgment, but we advanced the cause of “objectification” by developing valid correspondence and coherence measures of good judgment, by discovering links between how observers think and how they fare on these measures, and by determining the robustness of these links across scoring adjustments. From a policy perspective, there is value in using publicly verifiable correspondence and coherence benchmarks to gauge the quality of public debates. The more people know about pundits’ track records, the stronger the pundits’ incentives to compete by improving the epistemic (truth) value of their products, not just by pandering to communities of co-believers.
These are my principal arguments. Like any author, I hope they stand the test of time. I would not, however, view this project as a failure if hedgehogs swept every forecasting competition in the early twenty-first century. Indeed, this book gives reasons for expecting occasional reversals of this sort. This book will count as a failure, as a dead end, only if it fails to inspire follow-ups by those convinced they can do better.
1 For a passionate affirmation of these defenses, see W. Safire, “The New Groupthink,” New York Times, July 14, 2004, A27.
2 The characterization of human beings as rationalizing rather than rational animals is as old as Aristotle and as new as experimental social psychology. See Z. Kunda, Social Cognition: Making Sense of People (Cambridge, MA: MIT Press, 1999).
3 I. Berlin, “The Hedgehog and the Fox,” in The Proper Study of Mankind (New York: Farrar, Straus & Giroux, 1997), 436–98. Berlin traces the distinction—via Erasmus—2,600 years to a shadowy source on the edge of recorded Greek history: the soldier-poet Archilochus. The metaphorical meaning oscillates over time, but it never strays far from eclectic cunning (foxes) and dogged persistence (hedgehogs).
4 Extreme relativism may be a mix of anthropological and epistemological posturing. But prominent scholars have advanced strong “incommensurability arguments” that claim clashing worldviews entail such different standards of evidence as to make mutual comprehension impossible. In philosophy of science: P. Feyerabend, Against Method: Outline of an Anarchistic Theory of Knowledge (London: Humanities Press, 1975). In moral theory: A. MacIntyre, Whose Justice? Which Rationality? (London: Duckworth, 1988). Such arguments carry strong implications for how to do research: if they are correct, we should adopt a nonjudgmental approach to judgment, one limited to compiling colorful ethnographic catalogs of the odd ideas that have prevailed at different times and places.
5 For excellent compilations, and analyses, of such arguments, see R. Jervis, Perception and Misperception in International Politics (Princeton, NJ: Princeton University Press, 1976); R. E. Neustadt and E. R. May, Thinking in Time (New York: Free Press, 1986); Y. Vertzberger, The World in Their Minds (Stanford, CA: Stanford University Press, 1990); Y. F. Khong, Analogies at War (Princeton, NJ: Princeton University Press, 1993); B. W. Jentleson, ed., Opportunities Missed, Opportunities Seized: Preventive Diplomacy in the Post–Cold War World (Lanham, MD: Rowman & Littlefield, 1999); F. I. Greenstein, The Presidential Difference: Leadership Styles from FDR to Clinton (New York: Free Press, 2000); D. W. Larson and S. A. Renshon, Good Judgment in Foreign Policy (Lanham, MD: Rowman & Littlefield, 2003).
6 D. McCullough, Truman (New York: Simon & Schuster, 1992); B. J. Bernstein, “The Atomic Bombing Reconsidered,” Foreign Affairs 74 (1995): 147.
7 D. Welch and J. Blight, “The Eleventh Hour of the Cuban Missile Crisis: An Introduction to the ExComm Tapes,” International Security 12 (1987/88): 5–92; S. Stern, “Source Material: The 1997 Published Transcripts of the JFK Cuban Missile Crisis Tapes: Too Good to Be True?” Presidential Studies Quarterly 3 (1997): 586–93.
8 J. Matlock, Autopsy on an Empire: The American Ambassador’s Account of the Collapse of the Soviet Union (New York: Random House, 1995); B. Farnham, “Perceiving the End of Threat: Ronald Reagan and the Gorbachev Revolution,” in Good Judgment in Foreign Policy, 153–90; R. L. Garthoff, The Great Transition: American-Soviet Relations and the End of the Cold War (Washington, DC: Brookings Institution, 1994).
9 The debate on this case has only begun. But the 9/11 Presidential Commission has laid out a thoughtful framework for conducting it (The 9/11 Commission Report [New York: Norton, 2004]).
10 On the fundamental status of correspondence and coherence standards in judging judgment, see K. Hammond, Human Judgment and Social Policy: Irreducible Uncertainty, Inevitable Error, Unavoidable Injustice (New York: Oxford University Press, 1996).
11 This project offers many examples of interlocking convergence: our hedgehog-fox measure of cognitive style predicts indicators of good judgment similar to those predicted by kindred measures elsewhere; our qualitative analysis of forecasters’ explanations for their predictions dovetails with our quantitative analyses of why foxes outperformed hedgehogs; our findings of poky belief updating among forecasters, especially hedgehogs, mesh well with laboratory research on “cognitive conservatism.” Psychologists will see here the cumulative logic of construct validation. See D. T. Campbell and D. W. Fiske, “Convergent and Discriminant Validation by the Multitrait-Multimethod Matrix,” Psychological Bulletin 56 (1959): 81–105.
12 I avoid ambitious conceptions of good judgment that require, for instance, my judging how skillfully policy makers juggle trade-offs among decision quality (is this policy the best policy given our conception of national interest?), acceptability (can we sell this policy?), and timeliness (how should we factor in the costs of delay?) (A. L. George, Presidential Decision-Making in Foreign Policy [Boulder, CO: Westview, 1980]). I also steer clear of conceptions that require my judging whether decision makers grasped “the essential elements of a problem and their significance” or “considered the full range of viable options” (S. Renshon, “Psychological Sources of Good Judgment in Political Leaders,” in Good Judgment in Foreign Policy, 25–57).
13 My approach represents a sharp shift away from case-specific “idiographic” knowledge (who gets what right at specific times and places?) toward more generalizable or “nomothetic” knowledge (who tends to be right across times and places?). Readers hoping for the scoop on who was right about “shock therapy” or the “Mexican bailout” will be disappointed. Readers should stay tuned, though, if they are curious why some observers manage to assign consistently more realistic probabilities across topics.
14 The law of large numbers is a foundational principle of statistics, and Stigler traces it to the eighteenth century. He quotes Bernoulli: “For even the most stupid of men, by some instinct of nature … is convinced that the more observations have been made, the less danger there is of wandering from one’s goal.” And Poisson: “All manner of things are subject to a universal law that we may call the law of large numbers …: if we observe a large number of events of the same nature, dependent upon constant causes and upon causes that vary irregularly … we will find the ratios between the numbers of these events are approximately constant.” (S. Stigler, The History of Statistics: The Measurement of Uncertainty Before 1900 [Cambridge, MA: Harvard University Press, 1986], 65, 185)
15 Our correspondence measures focused on the future, not the present or past, because we doubted that the sophisticated specialists in our sample would make the crude partisan errors of fact ordinary citizens make (see D. Green, B. Palmquist, and E. Schickler, Partisan Hearts and Minds [New Haven, CT: Yale University Press, 2002]). Pilot testing confirmed these doubts. Even the most dogmatic Democrats in our sample knew that inflation fell in the Reagan years, and even the most dogmatic Republicans knew that budget deficits shrank in the Clinton years. To capture susceptibility to biases among our respondents, we needed a more sophisticated mousetrap.
16 For thoughtful discussions of correspondence measures, see A. Kruglanski, Lay Epistemics and Human Knowledge (New York: Plenum Press, 1989); D. A. Kenny, Interpersonal Perception (New York: Guilford Press, 1994).
17 John Swets, Signal Detection Theory and ROC Analysis in Psychology and Diagnostics (Mahwah, NJ: Lawrence Erlbaum, 1996).
18 J. Swets, R. Dawes, and J. Monahan, “Psychological Science Can Improve Diagnostic Decisions,” Psychological Science in the Public Interest 1 (2000): 1–26. These mental exercises compel us to be uncomfortably explicit about our priorities. Should we give in to the utilitarian temptation to save lives by ending a long war quickly via a tactical nuclear strike to “take out” the enemy leadership? Or should we define good judgment as the refusal to countenance taboo trade-offs, as the wise recognition that some things are best left unthinkable? See P. E. Tetlock, O. Kristel, B. Elson, M. Green, and J. Lerner, “The Psychology of the Unthinkable: Taboo Trade-Offs, Forbidden Base Rates, and Heretical Counterfactuals,” Journal of Personality and Social Psychology 78 (2000): 853–70.
19 Many studies have examined the varied meanings that people attach to verbal expressions of uncertainty: W. Bruine de Bruin, B. Fischhoff, S. G. Millstein, and B. L. Halpern-Felsher, “Verbal and Numerical Expressions of Probability: ‘It’s a Fifty-Fifty Chance,’ ” Organizational Behavior and Human Decision Processes 81 (2000): 115–23.
20 The pioneering work focused on weather forecasters. See A. H. Murphy, “Scalar and Vector Partitions of the Probability Score, Part I, Two-Stage Situation,” Journal of Applied Meteorology 11 (1972): 273–82; A. H. Murphy, “Scalar and Vector Partitions of the Probability Score, Part II, N-State Situation,” Journal of Applied Meteorology 12 (1972): 595–600. For extensions, see R. L. Winkler, “Evaluating Probabilities: Asymmetric Scoring Rules,” Management Science 40 (1994): 1395–1405.
21 The caveat is critical. The more experts knew, the harder it often became to find indicators that passed the clairvoyance test. For instance, GDP can be estimated in many ways (we rely on purchasing power parity), and so can defense spending.
22 F. Suppe, The Structure of Scientific Theories (Chicago: University of Chicago Press, 1973); S. Toulmin, Foresight and Understanding: An Inquiry into the Aims of Science (New York: Harper & Row, 1963).
23 C. Cerf and V. S. Navasky, eds., The Experts Speak: The Definitive Compendium of Authoritative Misinformation (New York: Pantheon Books, 1984).
24 A. Sen, Poverty and Famines (New York: Oxford University Press, 1981).
25 M. Feldstein, “Clinton’s Revenue Mirage,” Wall Street Journal, April 6, 1993, A14.
26 See Lester Thurow, Head to Head: The Coming Economic Battle among Japan, Europe, and America (New York: Morrow, 1992).
27 L. Savage, The Foundations of Statistics (New York: Wiley, 1954); W. Edwards, “The Theory of Decision Making,” Psychological Bulletin 51 (1954): 380–417.
28 It requires little ingenuity to design bets that turn violators of this minimalist standard of rationality into money pumps. People do, however, often stumble. See A. Tversky and D. Koehler, “Support Theory: A Nonextensional Representation of Subjective Probability,” Psychological Review 101 (1994): 547–67.
29 P. E. Tetlock, “Theory-Driven Reasoning about Possible Pasts and Possible Futures,” American Journal of Political Science 43 (1999): 335–36. Sherman Kent, the paragon of intelligence analysts, was an early advocate of translating vague hunches into precise probabilistic odds (S. Kent, Collected Essays [U.S. Government: Center for the Study of Intelligence, 1970], http://www.cia.gov/csi/books/shermankent/toc.html).
30 For an account of the Ehrlich-Simon bet, see John Tierney, “Betting on the Planet,” New York Times Magazine, December 2, 1990, 52–53, 74–81.
31 Suppe, The Structure of Scientific Theories; L. Laudan, Progress and Its Problems (Berkeley: University of California Press, 1977).
32 We discover how reliant we are on hidden counterfactuals when we probe the underpinnings of attributions of good or bad judgment to leaders. The simplest rule—“If it happens on your watch …”—has the advantage of reducing reliance on counterfactuals but the disadvantage of holding policy makers accountable for outcomes outside their control. Most of us want leeway for the possibilities that (a) some leaders do all the right things but—by bad luck—get clobbered; (b) other leaders violate all the rules of rationality and—by sheer dumb luck—prosper.
33 David K. Lewis, Counterfactuals (Cambridge: Harvard University Press, 1973).
34 The exact time of arrival of disappointment may, though, vary. The probability of black or red on a roulette spin should be independent of earlier spins. But political-economic outcomes are often interdependent. If one erroneously predicted the rise of a “Polish Peron,” one would have also been wrong about surging central government debt-to-GDP ratios, inflation, corruption ratings, and so on. Skeptics should predict as much consistency in who gets what right as there is interdependence among outcomes.
35 Radical skepticism as defined here should not be confused with radical relativism as defined earlier. Radical skeptics do not doubt the desirability or feasibility of holding different points of view accountable to common correspondence and coherence tests; they doubt only that, when put to these tests, experts can justify their claims to expertise.
36 The unpalatability of a proposition is weak grounds for rejecting it. But it often influences where we set our thresholds of proof. (P. E. Tetlock, “Political or Politicized Psychology: Is the Road to Scientific Hell Paved with Good Moral Intentions?” Political Psychology 15 [1994]: 509–30)
37 Berlin, “The Hedgehog and the Fox.”
38 For a review of work on cognitive styles, see P. Suedfeld and P. E. Tetlock, “Cognitive Styles,” in Blackwell International Handbook of Social Psychology: Intra-Individual Processes, vol. 1, ed. A. Tesser and N. Schwarz (London: Blackwell, 2000).
39 G. Gigerenzer and P. M. Todd, Simple Heuristics That Make Us Smart (New York: Oxford University Press, 2000).
40 H. J. Einhorn and R. M. Hogarth, “Prediction, Diagnosis and Causal Thinking in Forecasting,” Journal of Forecasting 1 (1982): 23–36.
41 For expansions of this argument, see P. E. Tetlock, R. S. Peterson, and J. M. Berry, “Flattering and Unflattering Personality Portraits of Integratively Simple and Complex Managers,” Journal of Personality and Social Psychology 64 (1993): 500–511; P. E. Tetlock and A. Tyler, “Winston Churchill’s Cognitive and Rhetorical Style,” Political Psychology 17 (1996): 149–70; P. E. Tetlock, D. Armor, and R. Peterson, “The Slavery Debate in Antebellum America: Cognitive Style, Value Conflict, and the Limits of Compromise,” Journal of Personality and Social Psychology 66 (1994): 115–26.
42 Peter Schwartz, The Art of the Long View (New York: Doubleday, 1991).
43 For a mathematical model of the effects of “unpacking” on probability judgments, see A. Tversky and D. Koehler, “Support Theory: A Nonextensional Representation of Subjective Probability,” Psychological Review 101 (1994): 547–67.
44 H. Bloom, Shakespeare: The Invention of the Human (New York: Riverhead, 1998).
CHAPTER 2
The Ego-deflating Challenge of Radical Skepticism