Expert Political Judgment

by Philip E. Tetlock


  A second line of resistance concedes that the marketplace of ideas may be suboptimal but doubts that tournaments as practiced today are a good way to improve the quality of debate. Here the fault lies with researchers, like yours truly, who have overplayed the psychometric functions of tournaments and underplayed the political functions. As noted earlier, tournaments can be redesigned to appeal to senior policy professionals by using the question-clustering approach. For example, we could ask high-status incumbents and their aides to nominate questions that they think their ideological side will have a comparative advantage in answering. Each side in a debate could pose questions that play to its strengths and exploit the other side’s real or imagined blind spots. Peter Scoblic, Katie Cochran, and I launched such a pilot exercise on gjopen.com in 2015 that featured debates among hawks, doves, and owls over the Iranian nuclear deal. If resistance to tournaments is grounded in misunderstandings about the feasibility of achieving both rigor and relevance, then it should vanish when elites see that tournaments need not be games of trivial pursuit focused on low-level, nitty-gritty issues. Elites should welcome the new opportunities to use tournaments as forums in which adversaries can collaborate to work out, much faster than is usually possible, who is right about what. Adversarial-collaboration tournaments should start popping up and depolarizing debates in once bitterly contested domains, from tax cuts to charter schools to foreign policy.
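  To make the question-clustering idea concrete, here is a minimal sketch, in Python, of how such a tournament might be tallied. Everything in it is invented for illustration: the forecaster labels, the cluster names, and the numbers. The point is only that scoring each camp separately on the questions it nominated and on the questions its rivals nominated makes comparative advantage, or its absence, visible.

    # Hypothetical sketch: tallying an adversarial-collaboration tournament
    # in which each camp nominates the question clusters it expects to win.
    from statistics import mean

    def brier(prob, outcome):
        # Brier score for a binary question: squared error between the
        # forecast probability and the 0/1 outcome. Lower is better.
        return (prob - outcome) ** 2

    # (question, nominating cluster, forecasting camp, probability, outcome)
    forecasts = [
        ("q1", "hawk-nominated", "hawk", 0.80, 1),
        ("q1", "hawk-nominated", "dove", 0.55, 1),
        ("q2", "dove-nominated", "hawk", 0.40, 0),
        ("q2", "dove-nominated", "dove", 0.15, 0),
    ]

    scores = {}
    for _, cluster, camp, p, y in forecasts:
        scores.setdefault((camp, cluster), []).append(brier(p, y))

    for (camp, cluster), s in sorted(scores.items()):
        print(f"{camp} on {cluster} questions: mean Brier = {mean(s):.3f}")

A camp that wins only on its own nominations has exposed a blind spot; a camp that wins on both clusters has earned a measure of the deference it claims.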

  This is the best-case scenario. It rests on the admittedly precarious premise that there are critical masses of elites on both sides of major debates whose motives are largely epistemic and who are open to participating in collaborative tournaments with their critics to test which side is closer to the optimal forecasting frontier on which issues.

  Then there is a darker set of possibilities: what if we fall far short of critical mass? What if elites care about evidence and arguments mainly as weapons to deploy in pursuit of personal status and the ascendancy of their ideological in-group? What if, as so many journalists fear at the moment, the readiness of the Trump administration to play fast and loose with even objectively verifiable facts, like the size of inauguration-day crowds, marks our descent into yet deeper partisan solipsism?44

  The third line of resistance to tournaments is grounded in the Machiavellian tenet that politics is the art of seizing power and holding on to it by convincing others that one’s position in the hierarchy is the natural order of things. It was incorrigibly naïve ever to have expected elites to engage with tournaments: Why would anyone savvy enough to reach the pinnacle of an opinion-leader hierarchy submit to a game in which the best outcome is to vindicate one’s status by breaking even, and the more likely outcome is the erosion of reputational capital that one labored a lifetime to accumulate?

  Keep in mind that, appearances notwithstanding, the lives of world-class pundits are not easy. There are plenty of people trying to push them off their perch. To justify their standing, elites are under continual pressure to be interesting, to appear to know things about the future that are not accessible to ordinary mortals. They must master the art of appearing to go out on a limb without actually venturing out. From a career perspective, giving up the cover of vague-verbiage forecasting would be reckless. As soon as they are caught on the wrong side of maybe, they will be pilloried, just as prediction markets were on Obamacare and Nate Silver was on the rise of Donald Trump.45 The right formula is to tell catchy stories about dramatic events, like the collapse of the British economy or a military clash between NATO and Russia, that “could” happen if we do or don’t follow the recommended path. That way, regardless of what happens, one is covered.
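  A bit of invented arithmetic makes this career calculus explicit. Under a proper scoring rule such as the Brier score, being caught on the wrong side of maybe is survivable on paper even when it is fatal in print:

    # Hypothetical comparison: a committed 70% call versus perpetual hedging,
    # scored with the Brier rule (squared error; lower is better).
    bold = 0.70
    print((bold - 1) ** 2)   # 0.09 if the event occurs: a strong score
    print((bold - 0) ** 2)   # 0.49 if it does not: mediocre, not the 1.0 maximum

    hedge = 0.50             # "it could go either way"
    print((hedge - 1) ** 2)  # 0.25 whatever happens
    print((hedge - 0) ** 2)  # 0.25 whatever happens

Over many questions, well-calibrated 70 percent calls beat chronic fence-sitting, but any single miss is what the headlines remember, which is exactly the asymmetry that keeps pundits wedded to vague verbiage.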

  If this jaundiced view is correct, efforts to make tournaments policy-relevant will repel, not attract, elites. Losing a tournament stocked with questions that both sides see as probative would be far more credibility-corrosive than losing a tournament stocked with easily dismissed trivial-pursuit questions.

  The final line of resistance is grounded in the sorts of processes that psychologists since Freud have stressed in one form or another: the unconscious drivers of human behavior.46 Harold Lasswell in 1930 pithily defined politics as the displacement of private motives onto public objects followed by the concealment and rationalization of that process with rhetoric about the public good. More recently, Jonathan Haidt has proposed that politics is driven by tribal attachments to communities of co-believers that bind and blind.47 First we feel, and then we invent cognitions to justify those feelings. In these views, neither elite pundits nor their readers are as thoughtful as they fancy themselves. Tournaments have little appeal because they pressure people to do something profoundly unnatural: to treat their beliefs not as all-or-none markers of tribal identification, but as probability judgments that are matters of degree and need to be continually tweaked or abandoned in response to events. For most people, tournaments are not nearly as much fun as basking in the reflected virtue of one’s own in-group and blasting the villainy of the other guys.48

  Mixing Machiavelli, Freud, Lasswell, and Haidt produces a potent cocktail of despair: politics can never be more than an endless succession of power plays bolstered by self-serving, emotion-soaked rationalizations. That is why I am pinning my hopes on the gradual emergence of elites who treat the truth as an end in itself and are ready to risk damage to both their personal egos and their public standing by entering into transparent, level-playing-field competitions with their adversaries.

  CLOSING OBSERVATIONS

  I understand why some saw EPJ as the definitive, thumbs-down verdict on expert political judgment. But I have always seen it as a conversation-starter, as a first step toward thinking more deeply about the criteria we use in judging judgment—and the challenges of designing systems that encourage us to treat our beliefs as testable hypotheses that we hold with varying degrees of probabilistic conviction. At a moment of surging concern about the adequacy of the marketplace of ideas to flush out falsehoods, tournaments provide a quiet sanctuary in which opponents can compete to become shrewder forecasters, not cleverer polemicists.

  EPJ is, at its idealistic core, an Enlightenment project. Experts and their schools of thought will always wax or wane in influence, sometimes by virtue of their skill and sometimes by sheer dumb luck. Tournaments will not stop this process of political natural selection, but they can improve the assertion-to-evidence ratios for observers struggling to weigh the merits of clashing claims. And the information value of tournaments will be multiplied whenever adversaries can agree, ex ante, on which shorter-term developments are diagnostic of which longer-term trends. For then they will have done more than make progress toward resolving policy differences. They will also have refined our collective understanding of progress itself: movement away from a world of murky post-truth relativism and toward a world with transparent standards for judging judgment that command some bipartisan deference.
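  A worked example, with invented numbers, shows what “diagnostic” means here in Bayesian terms. Suppose rival camps agree ex ante that a short-term signpost E is twice as likely to appear if a long-term trend H is real (probability 0.6) as it is if H is not (probability 0.3), and both camps start from an even prior. Then observing E obliges both to update the same way:

    P(H | E) = P(E | H) P(H) / [ P(E | H) P(H) + P(E | not-H) P(not-H) ]
             = (0.6)(0.5) / [ (0.6)(0.5) + (0.3)(0.5) ]
             = 0.67 (approximately)

The ex ante agreement is doing the work: because the likelihoods were fixed before the evidence arrived, neither side can reinterpret the signpost after the fact.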

  If we can agree that the future is best viewed through a probabilistic lens, we will have taken a big step toward increasing our ability to craft policies for improving it. This, I hope, will be the legacy of Expert Political Judgment: forecasting tournaments that become pipelines of nonpartisan probability judgments flowing into policy debates, reducing bickering, and even fostering comity. Let’s forget the dart-tossing chimps and embrace the human project of inventing ever more civilized tools for resolving disputes.
