The Signal and the Noise


by Nate Silver


  “Foxes often manage to do inside their heads what you’d do with a whole group of hedgehogs,” Tetlock told me. What he means is that foxes have developed an ability to emulate this consensus process. Instead of asking questions of a whole group of experts, they are constantly asking questions of themselves. Often this implies that they will aggregate different types of information together—as a group of people with different ideas about the world naturally would—instead of treating any one piece of evidence as though it is the Holy Grail. (FiveThirtyEight’s forecasts, for instance, typically combine polling data with information about the economy, the demographics of a state, and so forth.) Forecasters who have failed to heed Tetlock’s guidance have often paid the price for it.
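The aggregation habit described above can be sketched in a few lines of code. This is a toy illustration only, not FiveThirtyEight's actual methodology: the signals and weights below are made up for the example.

```python
# A minimal sketch of fox-style forecasting: combine several distinct
# signals into one estimate via a weighted average, rather than treating
# any single indicator as the Holy Grail. All numbers are hypothetical.
def aggregate(signals, weights):
    """Weighted average of point estimates (e.g. projected vote margins)."""
    total_weight = sum(weights)
    return sum(s * w for s, w in zip(signals, weights)) / total_weight

# Hypothetical margin estimates for one race, in percentage points,
# from three different kinds of evidence:
polls, economy, demographics = 3.0, 1.0, 2.0

# Polls get the most weight here; the weights are illustrative.
margin = aggregate([polls, economy, demographics], weights=[3, 1, 1])
```

The point is structural, not numerical: a forecast built this way degrades gracefully when any one input is wrong, which a single-variable "magic bullet" model cannot do.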

  Beware Magic-Bullet Forecasts

  In advance of the 2000 election, the economist Douglas Hibbs published a forecasting model that claimed to produce remarkably accurate predictions about how presidential elections would turn out, based on just two variables, one related to economic growth and the other to the number of military casualties.31 Hibbs made some very audacious and hedgehogish claims. He said accounting for a president’s approval rating (historically a very reliable indicator of his likelihood to be reelected) would not improve his forecasts at all. Nor did the inflation rate or the unemployment rate matter. And the identity of the candidates made no difference: a party may as well nominate a highly ideological senator like George McGovern as a centrist and war hero like Dwight D. Eisenhower. The key instead, Hibbs asserted, was a relatively obscure economic variable called real disposable income per capita.

  So how did the model do? It forecasted a landslide victory for Al Gore, predicting him to win the election by 9 percentage points. But George W. Bush won instead after the recount in Florida. Gore did win the nationwide popular vote, but the model had implied that the election would be nowhere near close, attributing only about a 1 in 80 chance to such a tight finish.32

  There were several other models that took a similar approach, claiming they had boiled down something as complex as a presidential election to a two-variable formula. (Strangely, none of them used the same two variables.) Some of them, in fact, have a far worse track record than Hibbs’s method. In 2000, one of these models projected a nineteen-point victory for Gore and would have given billions-to-one odds against the actual outcome.33

  These models had come into vogue after the 1988 election, in which the fundamentals seemed to favor George H. W. Bush—the economy was good and Bush’s Republican predecessor Reagan was popular—but the polls had favored Michael Dukakis until late in the race.34 Bush wound up winning easily.

  Since these models came to be more widely published, however, their track record has been quite poor. On average, in the five presidential elections since 1992, the typical “fundamentals-based” model—one that ignored the polls and claimed to discern exactly how voters would behave without them—has missed the final margin between the major candidates by almost 7 percentage points.35 Models that take a more fox-like approach, combining economic data with polling data and other types of information, have produced more reliable results.
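The "missed the final margin by almost 7 percentage points" figure is a mean absolute error. As a sketch of how that metric is computed (the forecast and actual margins below are invented for illustration, not the real election data):

```python
# Mean absolute error: the average size of a forecaster's misses,
# ignoring direction. Margins are in percentage points and are
# hypothetical, chosen only to demonstrate the calculation.
def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

predicted = [9.0, 19.0, -2.0, 6.0, 4.0]   # hypothetical forecast margins
actual    = [0.5,  0.5, -8.0, 2.4, 7.2]   # hypothetical final margins

mae = mean_absolute_error(predicted, actual)
```

A 7-point average miss is enormous in presidential elections, where the final margin itself is often smaller than that.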

  Weighing Qualitative Information

  The failure of these magic-bullet forecasting models came even though they were quantitative, relying on published economic statistics. In fact, some of the very worst forecasts that I document in this book are quantitative. The ratings agencies, for instance, had models that came to precise, “data-driven” estimates of how likely different types of mortgages were to default. These models were dangerously wrong because they relied on a self-serving assumption—that the default risk for different mortgages had little to do with one another—that made no sense in the midst of a housing and credit bubble. To be certain, I have a strong preference for more quantitative approaches in my own forecasts. But hedgehogs can take any type of information and have it reinforce their biases, while foxes who have practice in weighing different types of information together can sometimes benefit from accounting for qualitative along with quantitative factors.

  Few political analysts have a longer track record of success than the tight-knit team that runs the Cook Political Report. The group, founded in 1984 by a genial, round-faced Louisianan named Charlie Cook, is relatively little known outside the Beltway. But political junkies have relied on Cook’s forecasts for years and have rarely had reason to be disappointed with their results.

  Cook and his team have one specific mission: to predict the outcome of U.S. elections, particularly those for Congress. This means issuing forecasts for all 435 races for the U.S. House, as well as the 35 or so races for the U.S. Senate that take place every other year.

  Predicting the outcome of Senate or gubernatorial races is relatively easy. The candidates are generally well known to voters, and the most important races attract widespread attention and are polled routinely by reputable firms. Under these circumstances, it is hard to improve on a good method for aggregating polls, like the one I use at FiveThirtyEight.

  House races are another matter, however. The candidates often rise from relative obscurity—city councilmen or small-business owners who decide to take their shot at national politics—and in some cases are barely known to voters until just days before the election. Congressional districts, meanwhile, are spread throughout literally every corner of the country, giving rise to any number of demographic idiosyncrasies. The polling in House districts tends to be erratic at best36 when it is available at all, which it often isn’t.

  But this does not mean there is no information available to analysts like Cook. Indeed, there is an abundance of it: in addition to polls, there is data on the demographics of the district and on how it has voted in past elections. There is data on overall partisan trends throughout the country, such as approval ratings for the incumbent president. There is data on fund-raising, which must be scrupulously reported to the Federal Election Commission.

  Other types of information are more qualitative, but are nonetheless potentially useful. Is the candidate a good public speaker? How in tune is her platform with the peculiarities of the district? What type of ads is she running? A political campaign is essentially a small business: How well does she manage people?

  Of course, all of that information could just get you into trouble if you were a hedgehog who wasn’t weighing it carefully. But Cook Political has a lot of experience in making forecasts, and they have an impressive track record of accuracy.

  Cook Political classifies races along a seven-point scale ranging from Solid Republican—a race that the Republican candidate is almost certain to win—to Solid Democrat (just the opposite). Between 1998 and 2010, the races that Cook described as Solid Republican were in fact won by the Republican candidate on 1,205 out of 1,207 occasions—well over 99 percent of the time. Likewise, races that they described as Solid Democrat were won by the Democrat in 1,226 out of 1,229 instances.

  Many of the races that Cook places into the Solid Democrat or Solid Republican categories occur in districts where the same party wins every year by landslide margins—these are not that hard to call. But Cook Political has done just about as well in races that require considerably more skill to forecast. Elections they’ve classified as merely “leaning” toward the Republican candidate, for instance, have in fact been won by the Republican about 95 percent of the time. Likewise, races they’ve characterized as leaning to the Democrat have been won by the Democrat 92 percent of the time.37 Furthermore, the Cook forecasts have a good track record even when they disagree with quantitative indicators like polls.38
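The accuracy rates above follow directly from the win/loss counts cited in the text; a quick calculation confirms the "well over 99 percent" characterization:

```python
# Accuracy of Cook Political's "Solid" rating categories, 1998-2010,
# using the counts cited in the text (correct calls / total races).
ratings = {
    "Solid Republican": (1205, 1207),
    "Solid Democrat":   (1226, 1229),
}

accuracy = {label: 100 * correct / total
            for label, (correct, total) in ratings.items()}
# Both categories come out above 99.7 percent.
```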

  I visited the Cook Political team in Washington one day in September 2010, about five weeks ahead of that November’s elections, and spent the afternoon with David Wasserman, a curly-haired thirtysomething who manages their House forecasts.

  The most unique feature of Cook’s process is their candidate interviews. At election time, the entryway to the fifth floor of the Watergate complex, where the Cook offices are located, becomes a literal revolving door, with candidates dropping by for hourlong chats in between fund-raising and strategy sessions. Wasserman had three interviews scheduled on the day that I visited. He offered to let me sit in on one of them with a Republican candidate named Dan Kapanke. Kapanke was hoping to unseat the incumbent Democrat Ron Kind in Wisconsin’s Third Congressional District, which encompasses a number of small communities in the southwestern corner of the state. Cook Political had the race rated as Likely Democrat, which means they assigned Kapanke only a small chance of victory, but they were considering moving it into a more favorable category, Lean Democrat.

  Kapanke, a state senator who ran a farm supply business, had the gruff demeanor of a high-school gym teacher. He also had a thick Wisconsin accent: when he spoke about the La Crosse Loggers, the minor-league baseball team that he owns, I wasn’t certain whether he was referring to “logger” (as in timber cutter), or “lager” (as in beer)—either one of which would have been an apropos nickname for a ball club from Wisconsin. At the same time, his plainspokenness helped to overcome what he might have lacked in charm—and he had consistently won his State Senate seat in a district that ordinarily voted Democratic.39

  Wasserman, however, takes something of a poker player’s approach to his interviews. He is stone-faced and unfailingly professional, but he is subtly seeking to put the candidate under some stress so that they might reveal more information to him.

  “My basic technique,” he told me, “is to try to establish a comfortable and friendly rapport with a candidate early on in an interview, mostly by getting them to talk about the fuzzy details of where they are from. Then I try to ask more pointed questions. Name an issue where you disagree with your party’s leadership. The goal isn’t so much to get them to unravel as it is to get a feel for their style and approach.”

  His interview with Kapanke followed this template. Wasserman’s knowledge of the nooks and crannies of political geography can make him seem like a local, and Kapanke was happy to talk shop about the intricacies of his district—just how many voters he needed to win in La Crosse to make up for the ones he’d lose in Eau Claire. But he stumbled over a series of questions on allegations that he had used contributions from lobbyists to buy a new set of lights for the Loggers’ ballpark.40

  It was small-bore stuff; it wasn’t like Kapanke had been accused of cheating on his wife or his taxes. But it was enough to dissuade Wasserman from changing the rating.41 Indeed, Kapanke lost his election that November by about 9,500 votes, even though Republicans won their races throughout most of the similar districts in the Midwest.

  This is, in fact, the more common occurrence; Wasserman will usually maintain the same rating after the interview. As hard as he works to glean new information from the candidates, it is often not important enough to override his prior take on the race.

  Wasserman’s approach works because he is capable of evaluating this information without becoming dazzled by the candidate sitting in front of him. A lot of less-capable analysts would open themselves to being charmed, lied to, spun, or would otherwise get hopelessly lost in the narrative of the campaign. Or they would fall in love with their own spin about the candidate’s interview skills, neglecting all the other information that was pertinent to the race.

  Wasserman instead considers everything in the broader political context. A terrific Democratic candidate who aces her interview might not stand a chance in a district that the Republican normally wins by twenty points.

  So why bother with the candidate interviews at all? Mostly, Wasserman is looking for red flags—like the time when the Democratic congressman Eric Massa (who would later abruptly resign from Congress after accusations that he sexually harassed a male staffer) kept asking Wasserman how old he was. The psychologist Paul Meehl called these “broken leg” cases—situations where there is something so glaring that it would be foolish not to account for it.42

  Catching a few of these each year helps Wasserman to call a few extra races right. He is able to weigh the information from his interviews without overweighing it, which might actually make his forecasts worse. Whether information comes in a quantitative or qualitative flavor is not as important as how you use it.

  It Isn’t Easy to Be Objective

  In this book, I use the terms objective and subjective carefully. The word objective is sometimes taken to be synonymous with quantitative, but it isn’t. Instead it means seeing beyond our personal biases and prejudices and toward the truth of a problem.43

  Pure objectivity is desirable but unattainable in this world. When we make a forecast, we have a choice from among many different methods. Some of these might rely solely on quantitative variables like polls, while approaches like Wasserman’s may consider qualitative factors as well. All of them, however, introduce decisions and assumptions that have to be made by the forecaster. Wherever there is human judgment there is the potential for bias. The way to become more objective is to recognize the influence that our assumptions play in our forecasts and to question ourselves about them. In politics, between our ideological predispositions and our propensity to weave tidy narratives from noisy data, this can be especially difficult.

  So you will need to adopt some different habits from the pundits you see on TV. You will need to learn how to express—and quantify—the uncertainty in your predictions. You will need to update your forecast as facts and circumstances change. You will need to recognize that there is wisdom in seeing the world from a different viewpoint. The more you are willing to do these things, the more capable you will be of evaluating a wide variety of information without abusing it.

  In short, you will need to learn how to think like a fox. The foxy forecaster recognizes the limitations that human judgment imposes in predicting the world’s course. Knowing those limits can help her to get a few more predictions right.

  3

  ALL I CARE ABOUT IS W’S AND L’S

  The Red Sox were in a very bad mood. They had just returned from New York, where they had lost all three games of a weekend series to the hated Yankees, ending their chances to win the 2009 American League East title. With only seven games left in the regular season, the Red Sox were almost certain to make the playoffs as the American League’s wild card,* but this was not how the organization wanted to go into the postseason. Statistical studies have shown that the way a team finishes the regular season has little bearing on how they perform in the playoffs,1 but the Red Sox were starting to sense that it was not their year.

  I was at Fenway Park to speak to one person: the Red Sox’s star second baseman, Dustin Pedroia. Pedroia had been one of my favorite players since 2006, when PECOTA, the projection system that I developed for the organization Baseball Prospectus, had predicted that he would become one of the best players in baseball. PECOTA’s prediction stood sharply in contrast to the position of many scouts, who dismissed Pedroia as “not physically gifted,”2 critiquing his short stature and his loopy swing and concluding that he would be a marginal player. Whereas PECOTA ranked Pedroia as the fourth best prospect in baseball in 2006,3 the publication Baseball America, which traditionally gives more emphasis to the views of scouts, put him at seventy-seventh. Instead, reports like this one (filed by ESPN’s Keith Law4 early in Pedroia’s rookie season) were typical.5

  Dustin Pedroia doesn’t have the strength or bat speed to hit major-league pitching consistently, and he has no power. If he can continue to hit .260 or so, he’ll be useful, and he probably has a future as a backup infielder if he can stop rolling over to third base and shortstop.

  Law published that comment on May 12, 2007, at which point Pedroia was hitting .247 and had just one home run.6 Truth be told, I was losing my faith too; I had watched most of his at-bats and Pedroia looked overmatched at the plate.*

  But almost as though he wanted to prove his doubters wrong, Pedroia started hitting the tar out of the baseball. Over the course of his next fifteen games, he hit a remarkable .472, bringing his batting average, which had dropped to as low as .158 in April, all the way up to .336.

  In July, two months after Law’s report, Pedroia made the American League All-Star Team. In October, he helped the Red Sox win only their second World Series since 1918. That November he was named Rookie of the Year. The next season, at the age of twenty-four, Pedroia took the Most Valuable Player award as the American League’s best all-around performer. He wasn’t a backup infielder any longer but a superstar. The scouts had seriously underestimated him.

  I had come to Fenway because I wanted to understand what made Pedroia tick. I had prepared a whole list of questions, and the Red Sox had arranged a press credential for me and given me field-level access. It wasn’t going to be easy, I knew. A major-league playing field, which players regard as their sanctuary, is not the best place to conduct an interview. The Red Sox, coming off their losing weekend, were grumpy and tense.

  As I watched Pedroia take infield practice, grabbing throws from Kevin Youkilis, the team’s hulking third baseman, and relaying them to his new first baseman, Casey Kotchman, it was clear that there was something different about him. Pedroia’s actions were precise, whereas Youkilis botched a few plays and Kotchman’s attention seemed to wander. But mostly there was his attitude: Pedroia whipped the ball around the infield, looking annoyed whenever he perceived a lack of concentration from his teammates.

  After about fifteen minutes of practice, the Red Sox left the infield to the Toronto Blue Jays, their opponents that evening. Pedroia walked past me as I stood on the first-base side of the infield, just a couple of yards from the Red Sox’s dugout. The scouts were right about his stature: Pedroia is officially listed at five feet nine—my height if you’re rounding up—but I had a good two inches on him. They were also right about his decidedly nonathletic appearance. Balding at age twenty-five, Pedroia had as much hair on his chin as on his head, and a little paunch showed through his home whites. If you saw him on the street, you might take him for a video rental clerk.

 
