Expert Political Judgment Page 28 Read online free by Philip E. Tetlock


To explore this idea, we transformed the turnabout thought experiment into an actual experiment by asking respondents how they would react if a research team working in the Kremlin archives announced the discovery of evidence that shed light on three choice points in Soviet history: whether Stalinism could have been averted in the late 1920s, whether the cold war could have been brought to an end in the mid-1950s, and whether the Politburo in the early 1980s could just as easily have responded to Reagan’s policies in a confrontational manner.14

The Methodological Appendix presents the details of the sample, research procedures, and research design which took the form of a 2 × 2 × 3 mixed-design factorial, with two between-subjects independent variables—liberal or conservative tilt of evidence discovered by hypothetical research team and the presence or absence of methodological checks on ideological bias—and one repeated-measures factor representing the three historical “discoveries.” In the liberal-tilt condition, participants imagined that a team uncovers evidence that indicates Stalinism was avertable in the late 1920s, the cold war could have ended in the mid-1950s, and Reagan almost triggered a downward spiral in American-Soviet relations in the early 1980s that could have ended in war. In the conservative-tilt condition, participants imagined that the evidence indicates that history could not have gone down a different path at each of these three junctures. In the high-research-quality condition, participants are further asked to imagine that the team took special precautions to squelch political bias. In the unspecified-quality condition, participants received no such assurances. After reading about each discovery, participants judged the credibility of the research conclusions as well as of three grounds for impugning the team’s credibility: dismissing the motives of the researchers as political rather than scholarly, disputing the authenticity of documents, and arguing that key documents were taken out of context.
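The design described above can be laid out as a grid of cells. What follows is a minimal sketch: the factor labels are paraphrased from the text, and the variable and function names are chosen here for illustration only. It does not reproduce the study's actual materials.

```python
from itertools import product

# Between-subjects factors: each participant is assigned one level of each.
EVIDENCE_TILT = ["liberal", "conservative"]
RESEARCH_QUALITY = ["high (bias checks described)", "unspecified"]

# Repeated-measures factor: every participant judges all three discoveries.
DISCOVERIES = [
    "Could Stalinism have been averted in the late 1920s?",
    "Could the cold war have ended in the mid-1950s?",
    "Could the Politburo have responded confrontationally in the early 1980s?",
]


def between_subjects_groups():
    """Return the 2 x 2 = 4 groups a participant could be assigned to."""
    return list(product(EVIDENCE_TILT, RESEARCH_QUALITY))


def design_cells():
    """Return all 2 x 2 x 3 = 12 (tilt, quality, discovery) cells."""
    return list(product(EVIDENCE_TILT, RESEARCH_QUALITY, DISCOVERIES))


if __name__ == "__main__":
    for tilt, quality, discovery in design_cells():
        print(f"{tilt:12} | {quality:28} | {discovery}")
```

Each participant falls into exactly one of the four between-subjects groups but contributes ratings to all three discovery cells within that group, which is what makes the design "mixed."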

Table 5.3 shows that, although there was a weak effect of methodological precautions, that effect was eclipsed by the effects of ideological preconceptions. Regardless of announced checks on bias, both liberals and conservatives rated consonant evidence as highly credible and dissonant evidence as relatively incredible. When reacting to dissonant data discovered by a team that did not implement precautions, experts used all three belief system defenses: challenging the authenticity of the archival documents, the representativeness of the documents, and the competence and the motives of the unnamed investigators. The open-ended data underscore this point. The same tactics of data neutralization were almost four times more likely to appear in spontaneous comments on dissonant than on consonant evidence (62 percent of the thought protocols produced by experts confronting dissonant data contained at least one evidence-neutralization technique versus 16 percent of the protocols produced by experts confronting consonant data, with the size of the double standard about equal for experts on opposite sides of the ideology scale). When we measure the tendency to endorse all three tactics of data neutralization, this composite scale consistently predicts rejection of the conclusions that the investigators want to draw from their “discovery” (correlations from .44 to .57 across scenarios).15
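The "almost four times" figure follows directly from the two percentages reported above. The short sketch below only restates that arithmetic; the function names are hypothetical and no data beyond the 62 percent and 16 percent figures are assumed.

```python
def likelihood_ratio(pct_dissonant, pct_consonant):
    """How many times more often neutralization appears for dissonant evidence."""
    if pct_consonant <= 0:
        raise ValueError("consonant percentage must be positive")
    return pct_dissonant / pct_consonant


def percentage_point_gap(pct_dissonant, pct_consonant):
    """Absolute difference between the two rates, in percentage points."""
    return pct_dissonant - pct_consonant


if __name__ == "__main__":
    # Percentages of thought protocols containing at least one
    # evidence-neutralization technique, as reported in the text.
    print(likelihood_ratio(62, 16))
    print(percentage_point_gap(62, 16))
```

The ratio works out to 3.875, which is the basis for "almost four times," and the gap between the two rates is 46 percentage points.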

TABLE 5.3

Average Reactions to Dissonant and Consonant Evidence of Low- or High-quality Bearing on Three Controversial Close-call Counterfactuals

Note: Higher scores indicate higher credibility (col. 1) or greater resistance (cols. 2–4).

Hedgehogs were also more likely than foxes to deploy double standards in evaluating evidence. Far from changing their minds in response to dissonant discoveries, hedgehogs increased their confidence in their prior position, whereas foxes at least made small adjustments to the new evidence.16 Moreover, hedgehogs were defiant defenders of double standards. In debriefing, we asked experts how much their evaluations of the study were affected by the study’s results. Foxes were reluctant to acknowledge that they kept two sets of epistemological books and maintained their reactions would have been similar. By contrast, hedgehogs acknowledged that their reactions would have been strikingly different and defended a differential response. We return to these defenses in chapter 6 when we give hedgehogs an opportunity to respond to the whole battery of allegations of cognitive bias.17

The key point of the turnabout experiment is the pervasiveness of double standards: the tendency to switch on the high-intensity searchlight for flaws only in disagreeable results. Whether we trace the problem to excessive skepticism toward dissonant data or insufficient skepticism toward consonant data, our beliefs about what could have been can easily become self-perpetuating, insulated from disconfirming evidence by a thick protective belt of defensive maneuvers that attribute dissonant evidence to methodological sloppiness or partisan bias. It is telling that no one spontaneously entertained the possibility that “I guess the methodological errors broke in my direction this time.”

CLOSING OBSERVATIONS

This chapter underscores the power of our preconceptions to shape our view of reality. To the previous list of judgmental shortcomings—overconfidence, hindsight bias, belief underadjustment—we can add fresh failings: (a) the alacrity with which we fill in the missing control conditions of history with agreeable scenarios and with which we reject dissonant scenarios; (b) the sluggishness with which we reconsider these judgments in light of fresh evidence. It is easy, even for sophisticated professionals, to slip into tautological patterns of historical reasoning: “I know x caused y because, if there had been no x, there would have been no y. And I know that, ‘if no x, no y’ because I know x caused y.” Given the ontological inadequacies of history as teacher and our psychological inadequacies as pupils, it begins to look impossible to learn anything from history that we were not previously predisposed to learn.

These results should be unsettling to those worried about our capacity to avoid repeating the mistakes of the past. But they are reassuring to psychologists worried about the generalizability of their findings to real people judging real events. Surveying the findings laid out in chapters 2 through 5, we find several impressive points of convergence with the larger literature.

First, researchers have shown that experts, from diverse professions, can talk themselves into believing they can do things that they manifestly cannot.18 Experts frequently seem unaware of how quickly they reach the point of diminishing marginal returns for knowledge when they try to predict outcomes with large stochastic components: from recidivism among criminals to the performance of financial markets. Beyond a stark minimum, subject matter expertise in world politics translates less into forecasting accuracy than it does into overconfidence (and the ability to spin elaborate tapestries of reasons for expecting “favorite” outcomes).19

Second, like ordinary mortals, seasoned professionals are reluctant to acknowledge that they were wrong and to change their minds to the degree prescribed by Reverend Bayes. Experts judging political trends were as slow to modify their opinions as ordinary people have been to modify their views in laboratory experiments on belief perseverance. Reviewing the cognitive strategies experts used to justify holding firm, we discover a formidable array of dissonance-reduction strategies tailor-made for defusing threats to professional self-esteem.20
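To make "the degree prescribed by Reverend Bayes" concrete, here is a minimal sketch of a Bayesian update and of one simple way to express underadjustment. The numbers are hypothetical, and the underadjustment measure is an illustrative assumption rather than the scoring procedure used in the study.

```python
def bayes_posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of a hypothesis after observing the evidence."""
    for p in (prior, p_evidence_if_true, p_evidence_if_false):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    if denominator == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / denominator


def underadjustment(prior, reported_posterior, p_evidence_if_true, p_evidence_if_false):
    """Prescribed belief change minus reported belief change (illustrative measure).

    A positive value means the expert moved less than Bayes' rule prescribes.
    """
    prescribed = bayes_posterior(prior, p_evidence_if_true, p_evidence_if_false)
    return abs(prescribed - prior) - abs(reported_posterior - prior)


if __name__ == "__main__":
    # Hypothetical expert: 70 percent confident beforehand, then sees evidence
    # that is three times more likely if the hypothesis is false (0.6 vs 0.2).
    print(bayes_posterior(0.7, 0.2, 0.6))
    # The same expert reports dropping only to 65 percent.
    print(underadjustment(0.7, 0.65, 0.2, 0.6))
```

In this hypothetical case Bayes' rule prescribes a drop from 0.70 to about 0.44, so an expert who moves only to 0.65 has made a small fraction of the prescribed adjustment.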

Third, like ordinary mortals, experts fall prey to the hindsight effect. After the fact, they claim they know more about what was going to happen than they actually knew before the fact. This systematic misremembering of past positions may look strategic, but the evidence indicates that people do sometimes truly convince themselves that they “knew it all along.”21

Fourth, like ordinary mortals, experts play favorites in the hypothesis-testing game, applying higher standards of proof for dissonant than for consonant discoveries. This finding extends experimental work on theory-driven assessments of evidence22 as well as work on shifting thresholds of proof in science.23

Fifth, individual differences in styles of reasoning among experts parallel those documented in other populations of human beings. Whatever label we place on these individual differences (Isaiah Berlin’s classification of hedgehogs and foxes or the more prosaic taxonomies of psychologists who talk about “need for closure or structure or consistency” and “integrative simplicity-complexity”), a pattern emerges. Across several samples and tasks, people who value closure and simplicity are less accurate in complex social perception tasks and more susceptible to overconfidence, hindsight, and belief perseverance effects.24

In all five respects, our findings underscore that laboratory-based demonstrations of bounded rationality hold up in a more ecologically representative research design that focuses on the efforts of trained specialists (as opposed to sophomore conscripts) to judge complex, naturally occurring political events (as opposed to artificial problems that the experimenter has concocted with the intent of demonstrating bias).

Figure 5.1. This figure builds on figures 3.5 and 4.3 by inserting what we have learned about the greater openness of moderates, foxes, and integratively complex thinkers to dissonant historical counterfactuals. This greater willingness to draw belief-destabilizing lessons from the past increases forecasting skill via three hypothesized mediators: the tendencies to hedge subjective probability bets, to resist hindsight bias, and to be better Bayesians.

This chapter rounds out the normative indictment. Since we introduced the hedgehog-fox distinction in chapter 3, hedgehogs have repeatedly emerged as the higher-risk candidates for becoming “prisoners of their preconceptions.” Figure 5.1 integrates the key findings from chapter 5 into the conceptual model of good forecasting judgment that has been evolving through the last three chapters. In this revised scheme, moderate foxes have a new advantage over extremist hedgehogs—their greater tolerance for dissonant historical counterfactuals—in addition to their already established network of mutually reinforcing advantages: their greater capacity for self-critical integratively complex thinking, their greater flexibility as belief updaters, and their greater caution in using probability scales.

But have hedgehogs been run through a kangaroo court? Are many alleged errors and biases normatively defensible? Chapter 6 makes the case for the defense.

1 J. Fearon, “Counterfactuals and Hypothesis Testing in Political Science,” World Politics 43 (1991): 169–95, 474–84; P. E. Tetlock and A. Belkin, Counterfactual Thought Experiments in World Politics: Logical, Methodological, and Psychological Perspectives (Princeton, NJ: Princeton University Press, 1996).

2 S. Fiske and S. Taylor, “Social Cognition.”

3 N. J. Roese, “Counterfactual Thinking,” Psychological Bulletin 121 (1997): 133–48.

4 A.J.P. Taylor, The Struggle for Mastery in Europe, 1848–1918 (Oxford: Clarendon, 1954).

5 Pipes.

6 On the “theoretical implications” of the Cold War: J. L. Gaddis, “International Relations Theory and the End of the Cold War,” International Security 17 (1992): 5–58.

7 Stephen F. Cohen, Alexander Rabinowitch, and Robert Sharlet, The Soviet Union Since Stalin (Bloomington: Indiana University Press, 1985).

8 This differential predictability was not due to statistical artifacts such as differential reliability of measures or restriction-of-range artifacts.

9 N. J. Roese, “Counterfactual Thinking,” Psychological Bulletin 121 (1997): 133–48; P. E. Tetlock and P. Visser, “Thinking about Russia: Possible Pasts and Probable Futures,” British Journal of Social Psychology 39 (2000): 173–96.

10 J. A. Vasquez, “The Realist Paradigm and Degenerative versus Progressive Research Programs: An Appraisal of Neotraditional Research on Waltz’s Balancing Proposition,” American Political Science Review 91 (1997): 899–913; K. N. Waltz, Theory of International Politics (Reading, MA: Addison-Wesley, 1979).

11 S. Sagan and K. Waltz, The Spread of Nuclear Weapons: A Debate (New York: W. W. Norton, 1995).

12 P. E. Tetlock and A. Belkin, Counterfactual Thought Experiments in World Politics: Logical, Methodological, and Psychological Perspectives (Princeton, NJ: Princeton University Press, 1996).

13 J. Elster, Logic and Society: Contradictions and Possible Worlds (New York: John Wiley & Sons, 1978).

14 For a more extensive discussion of the utility of turnabout thought experiments, see P. E. Tetlock, “Political Psychology or Politicized Psychology: Is the Road to Scientific Hell Paved with Good Moral Intentions?” Political Psychology 15 (1994): 509–30.

15 For similar results in other experimental work on belief perseverance, see C. Lord, L. Ross, and M. Lepper, “Biased Assimilation and Attitude Polarization: The Effects of Prior Theories on Subsequently Considered Evidence,” Journal of Personality and Social Psychology 37 (1979): 2098–2109.

16 For parallel results in the more general literature on cognitive styles, see Kruglanski and Webster, “Need for Closure”; C. Y. Chiu, M. W. Morris, Y. Y. Hong, and T. Menon, “Motivated Cultural Cognition: The Impact of Implicit Cultural Theories on Dispositional Attributions Varies as a Function of Need for Closure,” Journal of Personality and Social Psychology 78 (2001): 247–59.

17 Experts were far more responsive to the manipulation of empirical findings than to that of research quality, ignoring the latter altogether when the data reinforced ideological preconceptions and giving only grudging consideration to high-quality data that challenged their preconceptions. Before issuing a blanket condemnation, however, we should consider three qualifications. First, not all experts ignored disagreeable evidence; some were swayed by high-quality dissonant evidence. Second, the greater effect sizes for “empirical findings” than for “research procedures” might merely reflect that we manipulated the former in a more compelling fashion than the latter. Comparisons of effect sizes across such different independent variables are notoriously problematic. Third, the data do not demonstrate that experts are too slow to accept dissonant evidence. It may be prudent to ask sharp questions of unexpected results.

18 Dawes, “Behavioral Decision Theory.”

19 H. N. Garb, Studying the Clinician: Judgment Research and Psychological Assessment (Washington, DC: American Psychological Association, 1998).

20 P. E. Tetlock, “Prisoners of our Preconceptions.”

21 S. Hawkins and R. Hastie, “Hindsight: Biased Judgment of Past Events after the Outcomes Are Known,” Psychological Bulletin 107 (1990): 311–27.

22 C. Lord, L. Ross, and M. Lepper, “Considering the Opposite: A Corrective Strategy for Social Judgment,” Journal of Personality and Social Psychology 47 (1984): 1231–43.

23 Timothy D. Wilson, Bella M. DePaulo, D. G. Mook, and K. G. Klaaren, “Scientists’ Evaluations of Research: The Biasing Effects of the Importance of the Topic,” Psychological Science 4 (September 1993): 322–25.

24 Suedfeld and Tetlock, “Individual Differences.”

CHAPTER 6

The Hedgehogs Strike Back

There are two sides to every argument, including this one.

—ANONYMOUS

IT REQUIRES better defense counsel than the author to get the hedgehogs acquitted of all the charges against them. Too many lines of evidence converge: hedgehogs are poor forecasters who refuse to acknowledge mistakes, dismiss dissonant evidence, and warm to the possibility that things could have worked out differently only if doing so rescues favorite theories of history from falsification.

That said, any self-respecting contrarian should wonder what can be said on behalf of the beleaguered hedgehogs. Fifty years of research on cognitive styles suggests an affirmative answer: it does sometimes help to be a hedgehog.1 Distinctive hedgehog strengths include their resistance to distraction in environments with unfavorable signal-to-noise ratios;2 their tough negotiating postures that protect them from exploitation by competitive adversaries;3 their willingness to take responsibility for controversial decisions guaranteed to make them enemies;4 their determination to stay the course with sound policies that run into temporary difficulties;5 and their capacity to inspire confidence by projecting a decisive, can-do presence.6

There is little doubt then that there are settings in which one does better heeding hedgehog rather than fox advice. But to dispel the cloud of doubt hovering over hedgehogs that has built up over the last three chapters, any spirited defense cannot shirk the task of rebutting the evidence in those chapters. A well-prepared hedgehog’s advocate should raise a veritable litany of objections:

1. “You claim we are flawed forecasters, but you messed up the grading by overrelying on a simplistic method of probability scoring.” We need sophisticated scoring systems that incorporate value, difficulty, and other adjustments that will even the score.

2. “You claim we are poky belief updaters, but you overrelied on simplistic Bayesian scoring rules that gave no weight to the valid arguments that experts invoked for not changing their minds.”

3. “You claim to have caught us applying double standards in judging agreeable versus disagreeable evidence, but you forget that some double standards are justifiable.”

4. “You claim we use counterfactual history to prop up our prejudices, but you do not appreciate the wisdom of adopting a deterministic perspective on the past.”

5. “You think you caught us falling prey to the hindsight bias, but you do not grasp how essential that ‘bias’ is to efficient mental functioning.”

6. This defense shifts from defending the answers hedgehogs gave to attacking the questions hedgehogs were asked. It concedes that there were systematic, difficult-to-rationalize biases running through hedgehog judgments but insists that if I had posed more intelligent questions, performance deficits would have disappeared.
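Objection 1 turns on how forecasts are graded. As a point of reference, a quadratic (Brier-style) probability score for binary outcomes can be sketched as follows. This is a generic textbook formulation assumed here for illustration; it is not necessarily the exact scoring rule used in the book, and the function name and sample forecasts are hypothetical.

```python
def probability_score(forecasts, outcomes):
    """Mean squared gap between forecast probabilities and 0/1 outcomes.

    0.0 is a perfect score; lower is better.
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equally long, non-empty forecast and outcome lists")
    squared_errors = [(f - o) ** 2 for f, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


if __name__ == "__main__":
    outcomes = [1, 0, 0]  # hypothetical: which events actually occurred
    bold = [0.9, 0.9, 0.1]  # confident forecaster, wrong on the second event
    hedged = [0.6, 0.6, 0.4]  # cautious forecaster with the same directional calls
    print(probability_score(bold, outcomes))
    print(probability_score(hedged, outcomes))
```

A rule of this kind treats every question as equally hard and every error as equally costly, which is the sense in which the objection calls it "simplistic." The value and difficulty adjustments the objection asks for would be layered on top of such a baseline.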
