by Nate Silver
All of Heyward’s comparables were big, strong, multitalented athletes who were high draft picks and displayed precocious skills in the minor leagues. But they met radically different fortunes. PECOTA’s innovation was to acknowledge this by providing a range of possible outcomes for each player, based on the precedents set by his comparables: essentially, best-case, worst-case, and most-likely scenarios. An endless array of outcomes can and will happen when we are trying to predict human performance.
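To make the idea of a range of outcomes concrete, here is a minimal sketch. It is not PECOTA's actual algorithm; it only assumes that each comparable player's eventual career value can be boiled down to a single number. The figures are invented for illustration, and the function name `outcome_range` is hypothetical.

```python
from statistics import median, quantiles


def outcome_range(comparable_outcomes):
    """Summarize comparables' outcomes as worst-case, most-likely, and best-case.

    Assumption: the 10th and 90th percentiles stand in for the worst and
    best cases, and the median stands in for the most likely outcome.
    """
    # quantiles(..., n=10) returns the nine cut points between deciles.
    deciles = quantiles(comparable_outcomes, n=10, method="inclusive")
    return {
        "worst_case": deciles[0],
        "most_likely": median(comparable_outcomes),
        "best_case": deciles[-1],
    }


# Hypothetical career-value figures for eight comparable players
# (made-up numbers, not real data).
hypothetical_outcomes = [2.0, 5.5, 11.0, 18.0, 24.5, 31.0, 40.0, 52.5]

scenarios = outcome_range(hypothetical_outcomes)
for label, value in scenarios.items():
    print(f"{label}: {value:.1f}")
```

The point of reporting three numbers rather than one is that the spread itself is information: two players with the same most-likely outcome can have very different best and worst cases.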
So far, things have been up and down for Heyward. After a terrific 2009 in which he was named the Minor League Player of the Year, he hit eight home runs in his first thirty major-league games upon making his debut with the Braves in 2010 and made the All-Star team, exceeding all expectations. His sophomore season in 2011 was rougher, however, and he hit just .227. A good statistical forecasting system might have found some reason to be optimistic after Heyward’s 2011 season: his numbers were essentially the same except for his batting average, and batting average is subject to more luck than other statistics.
But can statistics tell you everything you’ll want to know about a player? Ten years ago, that was the hottest topic in baseball.
Can’t We All Just Get Along?
A slipshod but nevertheless commonplace reading of Moneyball is that it was a story about the conflict between two rival gangs—“statheads” and “scouts”—that centered on the different paradigms that each group had adopted to evaluate player performance (statistics, of course, for the statheads, and “tools” for the scouts).
In 2003, when Moneyball was published, Michael Lewis’s readers would not have been wrong to pick up on some animosity between the two groups. (The book itself probably contributed to some of the hostility.) When I attended baseball’s Winter Meetings that year at the New Orleans Marriott, it was like being back in high school. On one side were the jocks, who, like water buffaloes at an oasis, could sometimes be found sipping whiskey and exchanging old war stories at the hotel bar. More often they sequestered themselves in their hotel rooms to negotiate trades with one another. These were the baseball lifers: mostly in their forties and fifties, many of them former athletes, they had paid their dues and gradually worked their way up through the organizational hierarchy. On the other side were the nerds: herds of twenty- and thirtysomethings, armed with laptop bags and color-printed position papers, doing lap after lap around the lobby hoping to pin down a jock and beg him for a job. There wasn’t much interaction between the two camps, and each side regarded the other as arrogant and closed-minded.
The real source of conflict may simply have been that the jocks perceived the nerds as threats to their jobs, usually citing real or perceived cuts in scouting budgets as their evidence. “It is adversarial right now,” Eddie Bane, the scouting director of the Anaheim Angels, told Baseball America in a contentious roundtable at the conference that focused on Moneyball.19 “Some of our old-time guys are losing jobs that we didn’t feel they should be losing. Maybe the cutbacks were due to money or whatever. But we correlate it to the fact that some of the computer stuff is causing that. And we resent it.”
How many teams had really cut their scouting budgets is unclear. The Toronto Blue Jays were one team that did so, and they paid the price for it, with a series of poor drafts between 2002 and 2005. But the budget cuts were forced by the peculiarities of their corporate parent, Rogers Communications, which was struggling with a weak Canadian dollar, and not the whim of their then general manager, the Beane disciple J. P. Ricciardi.
It’s now been a decade since the publication of Moneyball, however, and these brushfires have long since burned themselves out. The success of the Red Sox, who won their first World Series title in eighty-six years in 2004 with a fusion approach that emphasized both statistics and scouting, may have been a key factor in the détente. Organizations that would have been classified as “scouting” organizations in 2003, like the St. Louis Cardinals, have since adopted a more analytic approach and are now among the most innovative in the sport. “Stathead” teams like the Oakland A’s have expanded rather than contracted their scouting budgets.20
The economic recession of 2007 through 2009 may have further encouraged the use of analytic methods. Although baseball weathered the recession fairly well, suddenly everyone had become a Moneyball team, needing to make the best use possible of constrained budgets.21 There was no shortage of cheap stathead labor: economics and computer science graduates from Harvard and Yale, who might otherwise have planned on a $400,000-a-year job at an investment bank, were willing to move to Tampa or Cleveland and work around the clock for one-tenth of that salary. The $40,000 nerd was a better investment than a $40 million free agent who was doomed to see his performance revert to the mean.
It has not, however, been a unilateral victory for the statheads. If the statheads have proved their worth, so have the scouts.
PECOTA Versus Scouts: Scouts Win
PECOTA originally stood for Pitcher Empirical Comparison and Optimization Test Algorithm: a clunky acronym that was designed to spell out the name of Bill Pecota, a marginal infielder with the Kansas City Royals during the 1980s who was nevertheless a constant thorn in the side of my favorite Detroit Tigers.*
The program had initially been built to project the performance of pitchers rather than hitters. The performance of pitchers is notoriously hard to predict, so much so that after experimenting for a couple of years with a system called WFG—you can guess what the acronym stands for—Baseball Prospectus had given up and left their prediction lines blank. I sensed an opportunity and pitched PECOTA to Huckabay. Somewhat to my surprise, he and the Baseball Prospectus crew were persuaded by it; they offered to purchase PECOTA from me in exchange for stock in Baseball Prospectus on the condition that I developed a similar system for hitters.22 I did so, and the first set of PECOTA projections were published the next winter in Baseball Prospectus 2003.
When the 2003 season was over, we discovered that PECOTA had performed a little better than the other commercial forecasting systems.23 Indeed, each year from 2003 through 2008, the system matched or bettered the competition every time that we or others tested it,24 while also beating the Vegas over-under lines.25 There were also some fortuitous successes that boosted the system’s reputation. In 2007, for instance, PECOTA predicted that the Chicago White Sox—just two years removed from winning a World Series title—would finish instead with just seventy-two wins. The forecast was met with howls of protest from the Chicago media and from the White Sox front office.26 But it turned out to be exactly right: the White Sox went 72-90.
By 2009 or so, however, the other systems were catching up and sometimes beating PECOTA. As I had borrowed from James and Huckabay, other researchers had borrowed some of PECOTA’s innovations while adding new wrinkles of their own. Some of these systems are very good. When you rank the best forecasts each year in terms of how well they predict the performance of major league players, the more advanced ones will now usually come within a percentage point or two of one another.27
I had long been interested in another goal for PECOTA, however: projecting the performance of minor league players like Pedroia. This is potentially much harder. And because few other systems were doing it until recently, the only real competition was the scouts.
In 2006, I published a list of PECOTA’s top 100 prospects for the first time, comparing the rankings against the scouting-based list published at the same time by Baseball America. The players in the PECOTA list were ranked by how much value they were expected to contribute over the next six seasons once they matriculated to the major leagues.28
The 2011 season marked the sixth year since the forecasts were issued, so I was finally able to open up the time capsule and see how well they performed. Although the players on this list are still fairly young, we should have a pretty good idea by now of whether they are stars, benchwarmers, or burnouts.
This list did have Pedroia ranked as the fourth best prospect in baseball. And there were other successes for PECOTA. The system thought highly of the prospect Ian Kinsler, whom Baseball America did not have ranked at all; he has since made two All-Star teams and has become one of the cogs in the Texas Rangers’ offense. PECOTA liked Matt Kemp, the Dodgers superstar who nearly won baseball’s elusive Triple Crown in 2011, better than Baseball America did.
But have you ever heard of Joel Guzman? Donald Murphy? Yusemiro Petit? Unless you are a baseball junkie, probably not. PECOTA liked those players as well.
Baseball America also had its share of misses: the scouts were much too optimistic about Brandon Wood, Lastings Milledge, and Mark Rogers. But they seemed to have a few more hits. They identified stars like the Red Sox pitcher Jon Lester, the Rockies’ shortstop Troy Tulowitzki, and the Baltimore Orioles outfielder Nick Markakis, all of whom had middling minor-league statistics and whom PECOTA had not ranked at all.
There is enough data to compare the systems statistically. Specifically, we can look at the number of wins the players on each list generated for their major-league teams in the form of a statistic called wins above replacement player, or WARP,29 which is meant to capture all the ways that a player contributes value on the baseball diamond: hitting, pitching, and defense.
The players in the PECOTA list had generated 546 wins for their major-league teams through 2011 (figure 3-3). But the players in Baseball America’s list did better, producing 630 wins. Although the scouts’ judgment is sometimes flawed, they were adding plenty of value: their forecasts were about 15 percent better than ones that relied on statistics alone. That might not sound like a big difference, but it really adds up. Baseball teams are willing to pay about $4 million per win on the free-agent market.30 The extra wins the scouts identified were thus worth a total of $336 million over this period.*
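The arithmetic behind those figures is simple enough to check. This sketch uses only the numbers quoted above; the variable names are my own.

```python
# Wins above replacement generated through 2011 by each top-100 list.
pecota_wins = 546
baseball_america_wins = 630

# Approximate price of one win on the free-agent market, in dollars.
dollars_per_win = 4_000_000

# How many more wins the scouts' list produced.
extra_wins = baseball_america_wins - pecota_wins

# The scouts' edge relative to the stats-only list.
relative_edge = extra_wins / pecota_wins

# Market value of the extra wins.
extra_value = extra_wins * dollars_per_win

print(f"Extra wins: {extra_wins}")
print(f"Relative edge: {relative_edge:.1%}")
print(f"Value of extra wins: ${extra_value:,}")
```

The 84-win gap is where both headline numbers come from: it is roughly 15 percent of PECOTA's 546 wins, and at $4 million per win it comes to $336 million.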
The Biases of Scouts and Statheads
Although it would have been cool if the PECOTA list had gotten the better of the scouts, I didn’t expect it to happen. As I wrote shortly after the lists were published:31
As much fun as it is to play up the scouts-versus-stats angle, I don’t expect the PECOTA rankings to be as accurate as . . . the rankings you might get from Baseball America.
The fuel of any ranking system is information—and being able to look at both scouting and statistical information means that you have more fuel. The only way that a purely stat-based prospect list should be able to beat a hybrid list is if the biases introduced by the process are so strong that they overwhelm the benefit.
In other words, scouts use a hybrid approach. They have access to more information than statistics alone. Both the scouts and PECOTA can look at what a player’s batting average or ERA was; an unbiased system like PECOTA is probably a little bit better at removing some of the noise from those numbers and placing them into context. Scouts, however, have access to a lot of information that PECOTA has no idea about. Rather than having to infer how hard a pitcher throws from his strikeout total, for instance, they can take out their radar guns and time his fastball velocity. Or they can use their stopwatches to see how fast he runs the bases.
This type of information gets one step closer to the root causes of what we are trying to predict. In the minors, a pitcher with a weak fastball can rack up a lot of strikeouts just by finding the strike zone and mixing up his pitches; most of the hitters he is facing aren’t much good, so he may as well challenge them. In the major leagues, where the batters are capable of hitting even a ninety-eight-mile-per-hour fastball out of the park, the odds are against the soft-tosser. PECOTA will be fooled by these false positives while a good scout will not be. Conversely, a scout may be able to identify players who have major-league talent but who have yet to harness it.
To be sure, whenever human judgment is involved, it also introduces the potential for bias. As we saw in chapter 2, more information actually can make matters worse for people who take the wrong attitude toward prediction and use it as an excuse to advance a partisan theory about the way that the world is supposed to work—instead of trying to get at the truth.
Perhaps in the pre-Moneyball era, these biases were getting the better of the scouts. They may have been more concerned about the aesthetics of a player—did he fill out his uniform in the right way?—than about his talent. If recent Baseball America lists have been very good, the ones from the early 1990s32 were full of notorious busts—highly touted prospects like Todd Van Poppel, Ruben Rivera, and Brien Taylor who never amounted to much.
But statheads can have their biases too. One of the most pernicious ones is to assume that if something cannot easily be quantified, it does not matter. In baseball, for instance, defense has long been much harder to measure than batting or pitching. In the mid-1990s, Beane’s Oakland A’s teams placed little emphasis on defense, and their outfield was manned by slow and bulky players, like Matt Stairs, who came out of the womb as designated hitters. As analysis of defense advanced, it became apparent that the A’s defective defense was costing them as many as eight to ten wins per season,33 effectively taking them out of contention no matter how good their batting statistics were. Beane got the memo, and his more recent and successful teams have had relatively good defenses.
These blind spots can extract an even larger price when it comes to forecasting the performance of minor-league players. With an established major-league player, the question is essentially whether he can continue to perform as he has in the past. An especially clever statistical forecasting system might be able to divine an upward or downward trend of a few percentage points.34 But if you simply assume that the player will do about as well next season as he has in his past couple, you won’t be too far off. Most likely, his future capability will not differ that much from his present output.
Baseball is unique among the major professional sports, however, for its extremely deep minor-league system. Whereas the National Football League has no officially sanctioned minor league, and the NBA has just a few minor-league teams, baseball has 240, eight for each major-league parent. Moreover, whereas basketball or football players can jump from college or even high school straight into the pros and be impact players immediately upon arrival, this kind of instant stardom is extremely rare in baseball. Even the most talented draft picks may have to bide their time in Billings or Bakersfield or Binghamton before advancing to the major leagues.
It is very challenging to predict the performance of these players because we are hoping that they will eventually be able to do something that they are not capable of at present: perform at a high level in the major leagues. Save for a literal once-in-a-generation prospect like Bryce Harper, the best high school hitter in the country would get killed if he had to face off against major-league pitching. He will have to get bigger, stronger, smarter, and more disciplined in order to play in the majors—all of which will require some combination of hard work and good fortune. Imagine if you walked into an average high school classroom, got to observe the students for a few days, and were asked to predict which of them would become doctors, lawyers, and entrepreneurs, and which ones would struggle to make ends meet. I suppose you could look at their grades and SAT scores and who seemed to have more friends, but you’d have to make some pretty wild guesses.
And yet amateur scouts (and any statistical system designed to emulate them) are expected to do exactly this. Although some baseball players are drafted out of college, many others come straight from high school, and the scouting process can begin when they’re as young as their midteens. Like any group of young men, these players will be full of hormones and postadolescent angst, still growing into their bodies, dealing with the temptations of booze and the opposite sex. Imagine if you had to entrust the future of your business to a set of entitled nineteen-year-olds.
Beyond the Five Tools
As Lewis described in Moneyball, Billy Beane was one of those players who had prodigious talent but failed to realize it; a first-round draft pick in 1980, he played just 148 games in the majors and hit .219 for his career. Beane had a Hall of Fame career, however, compared with prospects like John Sanders, who is now a scout for the Los Angeles Dodgers.
Sanders once played in the major leagues. Exactly once—like Moonlight Graham from Field of Dreams. On April 13, 1965, when Sanders was nineteen, the Kansas City Athletics used him as a pinch-runner in the seventh inning of a game against the Detroit Tigers. Sanders didn’t so much as advance a base: the last two hitters popped out, and he was replaced before the next inning began.35 He would never play in the majors again.
Sanders did not lack talent. He had been a multisport star at Grand Island High School in Nebraska: All-State quarterback in 1963, All-State basketball in 1964, a gold-medal winner in the discus at the state track meet.36 Baseball might not even have been his best sport. But he was darned good at it, and when he graduated in the summer of 1964, he had a professional contract from the A’s to accompany his diploma.
But Sanders’s development was stymied by something called the Bonus Baby rule. Before the introduction of the major-league draft, in 1965, all amateur players were free agents and teams could pay them whatever they wanted. To prevent the wealthiest teams from hoarding the talent, the rule extracted a punishment—players who received a large signing bonus were required to spend their first two professional seasons on the major-league roster, even though they were nowhere near ready to play at that level.37
The rule really punished bright prospects like Sanders. Most of the Bonus Babies spent their time riding the bench, rarely seeing any big-league action. They were shut out from getting everyday game experience at the very time they needed it the most. Fans and teammates, wondering why some peach-fuzzed nineteen-year-old was being paid thousands to be a glorified batboy, were unlikely to be sympathetic to their plight. Although a few Bonus Babies like Sandy Koufax and Harmon Killebrew went on to have Hall of Fame careers, many other talented prospects of the era never overcame the experience.