Expert Political Judgment


by Philip E. Tetlock


  Research participants, fortunately, had the temerity to challenge these accuracy criteria. Sometimes they argued that what they predicted really did happen and that the author’s reality checks were flawed, hence the need for controversy adjustments to probability scoring. Sometimes they conceded that the predicted outcome did not occur but insisted that it almost occurred and might still occur, hence the need for fuzzy-set adjustments.

  These objections highlight the need to rethink the objectivist ontology underlying this project. The world is better viewed through a more pluralistic prism that allows for shades of grey between the real and unreal, for clashing perspectives on what is real or unreal, and even for the possibility that what we call the actual world is but one of a multiplicity of possible worlds, some of which may have once been considerably more likely than our world. The author’s house-of-cards argument crumbles when we replace his “naïve” objectivist framework with a sophisticated one that leaves room for legitimate differences of opinion on what is, what could have been, and what might yet be.

  The epicyclical complexities of scoring “adjustments” are proof that the author’s initial fears were sound: he did show bad scientific judgment in trying to objectify good political judgment. The author fell prey to a linguistic illusion. He inferred that, because people casually talk about something—good judgment—as though it were a unitary thing that we possess to varying degrees, that thing must exist in some quantifiable form. This project is the by-product of excessive literalism.

  Hard-line Neopositivist

  If the point of this project had been to derive a single numerical estimate of good judgment, a political IQ score that would rank experts along an ordinal scale, these criticisms might sting. But the author backed off from that reductionist goal at the outset. Indeed, he retreated too fast and too far: he bent over backward to accommodate virtually every self-serving protest that floundering forecasters advanced. In the process, he came close to doing exactly what his relativist critic just did: slip off the solipsistic precipice into the cloud-cuckoo land of relativism where no one need acknowledge mistakes because they can invoke value adjustments that give them credit for making the right mistake or, most ridiculous of all, fuzzy-set adjustments that give them credit for being almost right. The author narrowly dodged this fate in chapter 6 by placing at least some limits on how precisely tailored value adjustments can be to forecasters’ mistakes and on how generously fuzzy-set adjustments can be extended whenever forecasters cover up their mistakes with “I was almost right” exercises.

  The burden of proof now properly falls on relativists who want even more extravagant scoring adjustments. The time has come to put up or shut up: to propose schemes for assessing the accuracy and defensibility of real-world judgments that do not simply defer to every excuse experts offer.

  Moderate Neopositivist

  My hard-line ally draws sharper dichotomies than I do. But we agree that relativists take the greatest strength of the neopositivist framework—its flexible capacity to absorb “principled” objections and transform them into “technical” adjustments—and depict it as a fatal weakness. The gradual “complexification” of the author’s measures of good judgment does not nullify the entire approach. If anything, it brings the advantages of the neopositivist approach into focus. Hopelessly vague arguments about “who got what right” can take more tractable forms. We can say that the pragmatic foxes outperform the theory-driven hedgehogs, within certain ranges of difficulty, value, controversy, and even fuzzy-set adjustments, but if you introduce sufficiently extreme adjustments, then the differences between the two cognitive-stylistic groups disappear. We gain a framework for thinking about thinking that tells us the boundary conditions we should place on the generalization that experts with certain styles of thinking outperform experts with other styles of thinking.

  Reasonable Relativist

  Neopositivists can, if they wish, treat the “complexification” of measures of good judgment as progress. But note how much they have had to acknowledge the “perspectival” character of good judgment. Eventually, they may even recognize that disagreements over definitions of good judgment should not be relegated to measurement error; such disagreements should be the central focus of inquiry.

  Moderate Neopositivist

  Different phenomena are of interest for different purposes. My quarrel with the gentle relativist critic reduces to whether we should view the glass as partly full or still mostly empty. Relativists are too pessimistic because they adopt too tough a baseline of comparison. They ask: Have we created perfectly theory-free and value-free indicators of good judgment? And the answer is, “Of course not.” But a more reasonable baseline is whether we know more about good judgment variously conceived than we did before. Intellectual progress is sometimes possible only if we are prepared to jettison traditional ways of doing things, to experiment with new formats for expressing judgments (like subjective probability scales) and new methods of scoring judgments for accuracy and defensibility (like probability scoring—with adjustments, or reputational bets—with weighting of belief system defenses).
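  To make “probability scoring” concrete, here is a minimal sketch of a quadratic (Brier-style) probability score for a forecast over mutually exclusive outcomes. The three-outcome example and its numbers are illustrative assumptions, not the author’s data or his exact scoring rule.

```python
def brier_score(forecast, outcome_index):
    """Quadratic (Brier-style) probability score for one question.

    forecast: subjective probabilities over mutually exclusive, exhaustive
              outcomes (they should sum to 1.0).
    outcome_index: index of the outcome that actually occurred.
    Lower is better; 0.0 is a perfect forecast, 2.0 the worst possible.
    """
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(forecast)
    )

# Illustrative three-outcome question (status quo / moderate change / radical change).
print(brier_score([0.6, 0.3, 0.1], outcome_index=0))  # confident and right -> 0.26
print(brier_score([0.6, 0.3, 0.1], outcome_index=2))  # confident and wrong -> 1.26
```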

  Unrelenting Relativist

  Let the moderates meet at the mushy middle if they wish. But it is a mistake to forget that, in the end, this “objectivist” project reduces to a power grab: a bid by one scholarly community to impose its favorite analytical categories on other communities. After we strip away the highfalutin rhetoric, it comes down to a high-handed effort to tell people how to think. If you don’t think the way the self-appointed arbiters of good judgment say you should think, you can’t qualify as a good judge.

  Such high-handedness is all the more insufferable when it comes from pseudoscientists whose quantitative indicators rest on nothing more substantial than their opinion versus someone else’s. Whose value priorities in forecasting should we use: those who fear false alarms but are tolerant of misses or those who dread misses but are indulgent toward false alarms? Whose estimates of the base rates to enter into difficulty adjustments should we use: those who define the reference population of coup-prone states broadly or those who define the population narrowly? And what should we do when forecasters argue that the events they predicted really did happen (controversy adjustments) or that the events they predicted almost happened or will eventually happen (fuzzy-set adjustments)? Should we scold these forecasters for being bad Bayesians or congratulate them for having the courage of their convictions?
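  The relativist’s question about base rates can be made concrete. Any difficulty adjustment that scores a forecaster against a base-rate baseline is only as neutral as the chosen reference class. The sketch below uses a hypothetical coup forecast, two hypothetical base rates (a broad and a narrow reference class), and a simple skill-versus-baseline comparison; it illustrates the sensitivity at issue and is not the author’s actual adjustment formula.

```python
def brier(p, outcome):
    """Two-outcome Brier score: p is the probability assigned to 'event occurs',
    outcome is 1 if it occurred, 0 if it did not."""
    return (p - outcome) ** 2 + ((1 - p) - (1 - outcome)) ** 2

def skill_vs_base_rate(p, outcome, base_rate):
    """Positive: the forecaster beat a baseline that always predicts the base rate.
    Negative: the baseline did better."""
    return brier(base_rate, outcome) - brier(p, outcome)

# Hypothetical forecast: the expert assigns 0.7 to a coup, and the coup occurs.
forecast, outcome = 0.7, 1

# Broad reference class ("all developing states"): coups are rare.
print(skill_vs_base_rate(forecast, outcome, base_rate=0.1))  # 1.44 -> expert looks skilled
# Narrow reference class ("chronically coup-prone states"): coups are common.
print(skill_vs_base_rate(forecast, outcome, base_rate=0.6))  # 0.14 -> most of the credit evaporates
```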

  So-called belief system defenses are justified acts of intellectual rebellion against arbitrary neopositivist strictures for classifying answers as right or wrong. Close-call counterfactuals pose a metaphysical challenge. Who is the author to rule out the possibility that futures once widely deemed possible almost came into being and would have but for inherently unforeseeable accidents of fate? What privileged access to the counterfactual realm gives him warrant to announce that experts were wrong to have assigned an 80 percent or 90 percent probability to futures that never materialized? How does he know that we don’t happen to live in an unlikely, extremely unlikely world? The off-on-timing defense poses a similar challenge. Who appointed the author God: the epistemological umpire who calls forecasters “out” if a predicted outcome does not happen in the designated time range?

  Hard-line Neopositivist

  If we took the aggressive-relativist critique seriously, we would commit ourselves to a brand of cognitive egalitarianism in which it is bad manners to challenge anyone else’s conception of good judgment. Everything would reduce to Humpty Dumpty’s question: “Who is to be master?” As soon as someone raises a protest, we must either accommodate it or stand indicted of intellectual imperialism—of trying to reduce vast areas of political science to the status of disciplinary colonies of psychologists who study judgment and choice.

  That might not be such a bad thing. There is plenty of evidence of cognitive biases among political observers that the usual academic quality-control mechanisms are not up to correcting. But let’s follow the thread of the strong-relativist argument. If we all get to keep our own scorecards and to make whatever post hoc adjustments we want to our probability scores and belief-updating equations, we wind up back in the subjectivist swamp we vowed to escape in chapter 1. Relativists not only refuse to help; they try to push us back into the quagmire every time we try to lift ourselves out. Anytime anyone proposes a trans-ideological standard for evaluating claims, relativists deride the effort as a power grab.

  The reductio ad absurdum is, of course, that strong relativism is self-refuting. It too can be dismissed as a power grab—a power grab by radically egalitarian intellectuals who are skeptical of science and hostile to modernity. Indeed, it is hard to imagine a doctrine more diametrically opposed to the author’s agenda. Radical relativists transform scientific debates into ideological ones; the author transforms ideological debates into scientific ones.

  It is tempting to end this exchange by returning to Dr. Johnson’s famous rebuke of Bishop Berkeley: “I refute him thus,” where “thus” involved kicking a stone to dispel doubts about the existence of the external world. Here the stone-kicking involves pointing to errors so egregious that no one with any sense rises to their defense. We don’t have to look long. The death of the most famous strong relativist of recent times, Michel Foucault, offers just such an object lesson. In The Lives of Michel Foucault, David Macey tells a chilling tale of the consequences of acting on the postmodern doctrine that truth is a social construct. Foucault argued there are no objective truths, only truths promoted by dominant groups intent on preserving their dominance. In the early 1980s Foucault was infected with the AIDS virus. Like many others then, he dismissed the mounting evidence that a lethal epidemic was sweeping through the gay community. The “gay plague” was rumormongering by homophobes.

  Die-hard relativists might insist that Foucault will be posthumously vindicated. Or, upping the ante, they might argue that Thabo Mbeki, president of South Africa, will yet be vindicated for flirting with conspiracy theories of AIDS and denying poor pregnant women antiretroviral drugs. History will not be kind to this school of thought.

  Moderate Neopositivist and Reasonable Relativist

  The debate has again become unnecessarily polarized. Relativists are right that there is a certain irreducible ambiguity about which rationales for forecasting glitches should be dismissed as rationalizations and which should be taken seriously. And neopositivists are right that the objectivist methods used here are well suited for identifying double standards and for giving us a precise sense of how much persuasive weight rationalizations for forecasting failures must carry before they close the gap in forecasting performance between hedgehogs and foxes, or any other groups.
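  That “how much weight closes the gap” question can be posed mechanically. The sketch below uses purely hypothetical mean scores for the two groups and a simple linear discount on the miss portion of the hedgehog score (a crude stand-in for fuzzy-set-style forgiveness); the numbers and the discount rule are illustrative assumptions, not results from the author’s data.

```python
# Hypothetical mean probability scores (lower is better); all numbers are illustrative.
fox_score, hedgehog_score = 0.20, 0.30
hedgehog_miss_share = 0.6  # assumed fraction of the hedgehog score attributable to outright misses

def adjusted_hedgehog_score(discount):
    """Apply a fuzzy-set-style discount (0..1) to the miss portion of the score."""
    miss_part = hedgehog_score * hedgehog_miss_share
    return (hedgehog_score - miss_part) + miss_part * (1 - discount)

# Scan discounts to see how much forgiveness is needed before the gap disappears.
for discount in (0.0, 0.25, 0.5, 0.56, 0.75):
    print(discount, round(adjusted_hedgehog_score(discount), 3))
# Under these assumptions, only a discount of roughly 0.56 or more pulls the
# hedgehog score down to the fox score of 0.20.
```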

  Unrelenting Relativist

  Neopositivist social scientists like to wrap themselves in the successes of their big brothers and sisters in the biological and physical sciences. And exploiting a personal tragedy to undercut a scholar’s posthumous reputation is a low blow. Rather than dignify demagoguery with a response, let’s shift to topics where there is limited potential for agreement. In chapter 7, neopositivist research methods—experiments that explored the effects of question framing and unpacking—yielded results consistent with both the author’s theory and core tenets of constructivism. The results repeatedly showed that the answers historical observers reach hinge on the questions they pose. History looks slightly more contingent when we pose the query “At what point did alternative outcomes become impossible?” than when we pose the logically equivalent query “At what point did the observed outcome become inevitable?” And history looks massively more contingent when we unpack questions about alternative counterfactual outcomes into more specific sub-scenarios. Where we begin inquiry can thus be a potent determinant of where we end it. And inasmuch as there is a large set of “reasonable” starting points, there is an equally large set of reasonable termination points. Truth is perspectival, and the cognitive research program infuses this postmodern insight with deeper meaning by specifying exactly how our existing perspective refracts our view of possible pasts and futures.

  Unfortunately, although we can agree that historical observers do “construct” historical knowledge, this brief convergence of views breaks down when the author settles back into his habit of passing judgment on whether people were up to snuff on this or that correspondence or coherence “test” of good judgment. The author fixates on an apparent paradox: the admittedly odd phenomenon of sub-additive probabilities in which experts wind up judging entire sets of possibilities to be less than the sum of their exclusive and exhaustive components. Using formal logic and probability theory as benchmarks for good judgment, the author’s first instinct is to portray the framing and unpacking effects as evidence of incoherence. After all, he reasons, how can the likelihood of a set of possible futures be less than the sum of its subsets? How can we possibly justify subjective probabilities that sum to more than unity?
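  A minimal numeric sketch may make the incoherence at issue concrete (the events and probabilities below are invented for illustration): the packed question gets one judgment, the sum of judgments over its unpacked, mutually exclusive components exceeds it, and the fully unpacked set can sum to more than unity.

```python
# Hypothetical judgments by the same expert about the same question.
packed = {"some outcome other than the status quo": 0.40}

unpacked = {
    "leadership change by coup": 0.25,
    "leadership change by election": 0.20,
    "breakup of the state": 0.15,
    "any other non-status-quo outcome": 0.10,
}

print(sum(packed.values()))    # 0.40 -> judged probability of the whole set
print(sum(unpacked.values()))  # 0.70 -> sum of its exclusive, exhaustive components

# Add a separately judged status-quo branch (say, 0.55) and the unpacked
# probabilities total 1.25, more than unity: the sub-additivity the author flags.
print(sum(unpacked.values()) + 0.55)
```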

  By contrast, when we relativists “see” smart people doing something “stupid,” our first instinct is to ask whether we are imposing an inappropriate mental framework on those we observe. Relativists are epistemological pluralists, and we suspect that neopositivists make a “category mistake” when they label sub-additivity an error. Jerome Bruner’s brand of epistemic pluralism helps us to see why. His theory of knowledge allows for two distinct modes of ordering experience: the logico-scientific and the narrative-humanistic. “Efforts to reduce one mode to the other or to ignore one at the other’s expense inevitably fail to capture the rich diversity of thought.”5

  The author’s category mistake was to apply the standards of formal logic to thinking organized around storytelling—and organized this way for good reasons. Philosophers of history have noted that narrative explanations are more flexible and thus better suited than cumbersome covering laws to making sense of quirky concatenations of events that unfold only once and that force us to rely on what-if guesswork to infer causality.6 Narratives are so compelling, in this view, because they are so lifelike: they capture contingencies that so frequently arise in daily life. There should be no mystery why storytelling predates probability theory by several millennia: stories map more readily onto human experience.

  To preempt another cheap shot by the hard-line neopositivist, I stress that I am not trying to defend sub-additivity in the court of formal logic; rather, I seek to move the normative case to a jurisdiction governed by “narrativist” norms. The standards in this new court stress thematic coherence and imaginative evocativeness. It is no more surprising that storytellers fail coherence tests of subjective probabilities than that musicians flunk tests in acoustical engineering or that painters don’t know the wave/particle theory of light.

  Here is the soft methodological underbelly of this project: the misbegotten notion that it makes sense to rely on benchmarks of good judgment derived from probability theory. The probability calculus is inappropriate. Questions posed in probabilistic terms require experts to shift into an unnatural discourse: to abandon a free-flowing storytelling mode that traces the rich interconnections among events and to adopt a logical mode that requires affixing artificially precise probabilities to arbitrarily restrictive hypotheses about states of nature.

  Hard-line Neopositivist

  Sub-additivity is so flagrant a violation of formal rationality, not to mention common sense, that I thought only the lunatic fringe would challenge its status as an error. So, it is revealing that the strong relativist chose to take a stand even here.

  Let’s scrutinize the claim that sub-additivity ceases to be an error when we evaluate expert judgment not against the canons of logic but against those of good storytelling. This live-and-let-live complementarity argument treats the two modes of knowing as if they existed in qualitatively different realms and as if contradictions between them were the inventions of confused souls who make the “category mistake” of plugging standards appropriate to one arena of life into arenas properly governed by other standards. We should thus judge narrativists and scientists by separate standards: those of good storytelling and good hypothesis testing.

  As a formula for academic civility between the humanities and sciences, this approach is commendable. But as a formula for coping with everyday life, the complementarity thesis is inadequate.7 One can sympathize with the separate-but-equal sentiment but still be stuck with the practical problem of when to be guided by one or the other account. Consider how easy it is to tell engrossing, completely true, tales of airplane accidents that overinflate our probability estimates of the risk of flying. It is hard, moreover, to allay these anxieties, once aroused, with dispassionate recitals of statistics on the true risk per passenger mile of different modes of transportation. People who are swayed by the stories and drive rather than fly across the United States will wind up injured or dead in greater numbers than those who heed the statistical evidence. Indeed, in the long run, the 9/11 attacks may well claim their largest number of victims by diverting people into automobile travel.8 Sometimes there are right answers: sub-additivity is an error, no matter how frantically relativists try to spin it into something else.

  Let’s also scrutinize the claim that it is somehow unnatural to think probabilistically about the events under examination here. It implies that we can do away with efforts to quantify uncertainty and judge judgment by reference to other (conveniently unspecified) benchmarks.

 
