Smart Baseball


by Keith Law


  If the idea of stealing strikes sounds wrong or jarring to you, well, you’re far from alone. There’s no other situation on the field where we accept tricking the umps as a part of the game, much less as a player skill worth compensating. If an outfielder were discovered to have a special ability to trap balls in play so that they appeared to be caught on the fly for outs, we wouldn’t stand for this. We don’t allow pitchers to doctor the baseball to make it move in unusual ways—a matter of player safety as well as of fairness. But the idea of a good pitch-framer is well ingrained in the game, so when analysts like Turkenkopf and later Mike Fast of Baseball Prospectus (and now the Houston Astros) started to put large values on the best and worst framing performances, no one batted an eye.

  You’ll hear various maxims around baseball like “the best pitch in baseball is strike one” or that “the most important pitch is the one-one pitch,” the latter referring to the pitch thrown when the count on the batter is one ball and one strike. There’s some real truth in there, despite their pithy nature, because the expected outcome of an at bat shifts dramatically with the count. When the first pitch of an at bat was a strike, hitters in 2016 hit .223/.266/.352, and when the first pitch was a ball, hitters hit .271/.382/.457. When the 1-1 pitch was a strike, hitters in 2016 hit .178/.229/.279, and when the 1-1 pitch was a ball, hitters hit .249/.386/.418. Those are enormous gaps. Catcher-framing stats work off those gaps, estimating the value of a ball called a strike (or vice versa) because of the way the catcher caught the pitch based on the count at which it occurred.
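  To see the size of those gaps at a glance, here is a minimal sketch that subtracts the slash lines quoted above. The `SlashLine` class and the constant names are illustrative inventions, not anything from a framing model; only the 2016 numbers come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlashLine:
    """Batting average / on-base percentage / slugging percentage."""
    avg: float
    obp: float
    slg: float

    def gap(self, other: "SlashLine") -> "SlashLine":
        """Return self minus other, rounded to three decimals."""
        return SlashLine(
            round(self.avg - other.avg, 3),
            round(self.obp - other.obp, 3),
            round(self.slg - other.slg, 3),
        )


# 2016 splits quoted in the text (hypothetical constant names).
AFTER_FIRST_PITCH_STRIKE = SlashLine(0.223, 0.266, 0.352)
AFTER_FIRST_PITCH_BALL = SlashLine(0.271, 0.382, 0.457)
AFTER_1_1_STRIKE = SlashLine(0.178, 0.229, 0.279)
AFTER_1_1_BALL = SlashLine(0.249, 0.386, 0.418)

# How much better hitters fare when the pitch is called a ball.
first_pitch_gap = AFTER_FIRST_PITCH_BALL.gap(AFTER_FIRST_PITCH_STRIKE)
one_one_gap = AFTER_1_1_BALL.gap(AFTER_1_1_STRIKE)

print("first pitch:", first_pitch_gap)
print("1-1 pitch:  ", one_one_gap)
```

  On these numbers the swing in on-base percentage is larger on the 1-1 pitch than on the first pitch, which is the kind of count dependence a framing stat has to account for.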

  Fast’s September 2011 piece at Baseball Prospectus, “Removing the Mask,” built on Turkenkopf’s work and showed that pitch-framing mattered in a huge way; Fast’s research, covering the seasons 2007 to 2011, showed Jose Molina’s framing was worth 35 extra runs to his team per 120 games caught (about a full season for a starting catcher), Jonathan Lucroy ranked second at 24 runs per 120 games, and the worst catchers in that span cost their teams about 25 runs per 120 games caught. Just going from a league-worst framer to an average one would be worth about two and a half wins to your team, using the rough guideline of about ten runs scored or prevented adding up to a win of value. (That equivalency will vary over time, depending on the run-scoring environment, but it’s not going to vary enough to change the conclusion here that framing matters.)
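  The arithmetic behind that claim is short enough to write out. This sketch uses the ten-runs-per-win guideline and the per-120-games figures from the paragraph above; the function and variable names are made up for illustration, and the conversion rate is only a rough rule of thumb.

```python
RUNS_PER_WIN = 10.0  # rough guideline from the text; shifts with the run-scoring environment


def runs_to_wins(runs: float, runs_per_win: float = RUNS_PER_WIN) -> float:
    """Convert a run difference into an approximate number of wins."""
    return runs / runs_per_win


# Framing runs per 120 games caught, relative to an average framer (0).
WORST_FRAMERS = -25.0
AVERAGE_FRAMER = 0.0
JOSE_MOLINA = 35.0

worst_to_average = runs_to_wins(AVERAGE_FRAMER - WORST_FRAMERS)
worst_to_best = runs_to_wins(JOSE_MOLINA - WORST_FRAMERS)

print(f"worst to average: {worst_to_average:.1f} wins")
print(f"worst to best:    {worst_to_best:.1f} wins")
```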

  Fast’s research landed him a job with Houston and changed the market for valuing catchers. Catchers who could frame were suddenly in demand, and catchers who couldn’t saw their markets start to shrink. The Tampa Bay Rays acquired Molina, a truly awful hitter, from Toronto, and he caught in 281 games for them over the next three years despite a .213/.271/.286 triple-slash line. Over that span, his framing was worth 70 runs prevented to Tampa Bay, according to Baseball Prospectus’s framing statistics.

  Tampa Bay’s GM, Andrew Friedman, left the Rays to become the president of baseball operations for the Los Angeles Dodgers after the 2014 season, and hired Farhan Zaidi, formerly an assistant GM for the Oakland A’s who was also in charge of much of the team’s statistical research, to be the Dodgers’ new GM. One of their most significant moves in their first off-season at the helm of the Dodgers was a massive trade with the San Diego Padres that included catcher Yasmani Grandal, who had worn out his welcome with Padres pitchers for his poor receiving and game-calling. Grandal happened to be an outstanding pitch-framer, and his framing saved the Dodgers 25.6 runs in 2015 and 27.5 runs in 2016, also according to BP’s framing stats.

  Baseball Prospectus’s Harry Pavlidis and Jonathan Judge have continued to improve their framing metrics, introducing a new version before the 2014 season that they called RPM framing, with RPM standing for “regressed, probabilistic model.” This is heavier-weight statistical work that smooths out the data by looking at the probability of each pitch type in each location being called a strike, taking umpire tendencies on those pitches into account, and then regressing those career totals to the league average, a statistical technique designed to reduce noise or randomness in the sample.
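  The idea of regressing to the league average can be illustrated with a generic shrinkage formula. To be clear, this is not BP's actual model, which the text describes only in outline; the function name, the `stabilizer` parameter, and the sample numbers are all assumptions chosen to show the general technique.

```python
def regress_to_mean(
    observed_rate: float,
    sample_size: int,
    league_rate: float,
    stabilizer: float,
) -> float:
    """Shrink an observed rate toward the league rate.

    `stabilizer` is a hypothetical tuning constant: the sample size at which
    the observed rate and the league rate get equal weight. Small samples are
    pulled strongly toward the league average; large ones barely move.
    """
    weight = sample_size / (sample_size + stabilizer)
    return weight * observed_rate + (1.0 - weight) * league_rate


# Made-up example: a catcher who has "stolen" 2.0 extra strikes per 100
# called pitches, against a league average of 0.0.
small_sample = regress_to_mean(2.0, sample_size=1_000, league_rate=0.0, stabilizer=1_000)
large_sample = regress_to_mean(2.0, sample_size=9_000, league_rate=0.0, stabilizer=1_000)

print(small_sample, large_sample)
```

  The same observed rate earns more credit once it is backed by more pitches, which is how regression reduces noise in small samples.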

  BP’s results at that time weren’t too surprising; for the period 2008–13, the best framers were Brian McCann, Jose Molina, and Jonathan Lucroy, while the worst were Ryan Doumit, Gerald Laird, and Chris Iannetta. Doumit retired after 2014 and hadn’t caught even half of his team’s games since 2010, while Laird played just one game in 2015 before calling it quits. Iannetta continues to get opportunities to play regularly behind the plate as of this writing, although for this I can offer no explanation. BP further updated their model before the 2016 season and it remains the best framing metric available in the public sphere.

  Pavlidis and Judge continued their work on quantifying the value of catcher defense with a pair of presentations at the annual sabermetrics conference called Saberseminar, held every year in mid-August in Boston. Their work looked at the question of catcher game-calling (which would include calling pitch types and locations) and whether we might be able to infer any such skill from looking at pitch-by-pitch data for various catchers while they were behind the plate. This was a preliminary look and it appeared that the impact that they found was much smaller than that of pitch-framing, but it also demonstrates how the torrent of new data available to teams and analysts is changing the way we look at the game. Now we can question conventional wisdom—perhaps refuting it, but also perhaps confirming it—in ways that were impossible a decade earlier.

  And in the end, the same can be said, not just for catching stats but for defensive stats as a whole. Fielding was long a total unknown for analysts and front office executives trying to ascertain the true value of a player’s performance, and then it gradually improved as better play-by-play data became available for the majors and eventually for the minors as well. Analysts still had to make some estimates, so these metrics never had the level of accuracy we expect from, say, batting statistics; we’re estimating how often a fielder makes a certain play, how many runs the play would be worth if it weren’t made, and even things like how hard the ball was hit or where the fielders involved were standing at the start of the play.

  While the data are getting better with the help of MLB, many of the emerging defensive metrics remain proprietary. Several analysts told me that UZR and dRS represent the best fielding metrics available to the lay public, so I’ll continue to use them to talk about player values. But the divergence between what teams know and what is available to those of us in the public is emblematic of the rise of Big Data within baseball; valuing fielding has been seen as a sort of Hilbert’s problem in baseball for several decades, a critical question to answer if teams wanted to value players accurately, but a question in search of the right data to allow anyone to solve it. This remains an ongoing effort within front offices, one that has accelerated with the advent of new, precise data on fielder positioning—that is, where everyone’s standing when the play starts—that’s available to all thirty teams but, alas, not yet available to the public.

  The knowledge gap is also emblematic of a greater gap between what teams know and what the average fan can follow. In many ways, baseball is becoming a more technical sport to understand. It’s still a simple game to watch and enjoy, but as you’ll see later, in Part Three, teams are now advancing in data analysis at a speed we haven’t previously seen in the sport, introducing not just new metrics but entirely new ways to think about player performance.

  14

  No Puns Intended:

  Going to WAR to Value the Whole Player

  If you’ve been confused by recent debates about baseball players (whether it’s trades or MVP Awards or the Hall of Fame) that refer to stats like WAR (Wins Above Replacement), this is the chapter for you. That doesn’t mean this chapter is about WAR specifically, and it’s certainly not going to be a defense of the construct (WAR itself isn’t a stat, but a way of putting other stats together), but it is going to explain why we might want to look at players the way the various flavors of WAR let us. And if you want to understand these debates, or think about players the way front offices do, you need to understand how we even got to this point, where WAR could dominate debates over player value even when we don’t agree on what the best WAR is.

  When I say that WAR isn’t a stat, but a construct, I’m trying to get a couple of pieces of information across in the most concise fashion I can. Measuring player value is an idea. There are many ways to implement this idea, depending on how you measure the individual parts of a player’s performance, how you weigh them against one another, and what adjustments you include for the player’s environment or other outside factors. WAR has become like Kleenex here: Kleenex isn’t the only kind of tissue, and WAR isn’t the only way to measure a player’s total value, but the terms have lost their specific meanings over time.

  Wins Above Replacement means roughly what it says: this player produced this many wins of total value above a replacement-level (baseline) player. It doesn’t tell you how to measure any of that. You can use whatever formula for wins you want, and whatever kind of replacement-level player calculation you want, and call the difference WAR. You could try to calculate a pitcher’s WAR with his win total; you’d get ridiculous results, but as long as you compared it to a replacement level, well, that’s kind of Wins Above Replacement. (I threw up in my mouth as I typed that.) You have to compare the player’s production to something, and that something isn’t zero; it can be an average, but the overwhelming preference of analysts and sabermetricians is to compare it to replacement level, a number based off the average that represents the “free” talent available to teams in the form of triple-A players. The value a player generates above that free talent is the value he delivers to his team, in wins or, ultimately, in dollars.

  If someone says to you they don’t like WAR or use WAR, or they think WAR is a garbage stat, they’re telling you they don’t understand WAR. WAR is a construct, a bare-bones blueprint for comparing a player’s total value to an objective baseline level tied to playing time. If you ask analysts working for MLB teams how they calculate the total value of a player’s production, the answers you get will sound an awful lot like WAR. Their calculations are more precise than the public ones, but the core concept is the same.

  Set WAR aside for a few moments and just imagine you’re the general manager of a Major League Baseball team. It’s November 1, and free agency is about to start. Your owner has told you that you have $30 million in extra budget to spend to acquire new players this off-season, by whatever means you choose. That’s over and above what’s already committed to players currently in your organization. What should your goal be?

  If you said “to make your team better,” well, duh. That’s always the goal. But the goal this time around has a constraint, too: you have a fixed amount of room in your budget. You could increase that by trading away existing players, but you’d also lose their contributions. However you look at the problem, you have a boundary on how much you can do.

  So instead, your goal has to be more specific, acknowledging that constraint explicitly. I would phrase the goal as “to improve the team as much as possible given the maximum payroll,” or, even more specifically, “to add as many wins to the team in the next season as possible,” assuming that you’re not trying to rebuild, of course. You want to add players who’ll add wins. You want to add the best players $30 million can buy you. And that probably means buying players, whether it’s via free agency or trade, who’ll be worth more than the price you pay to acquire them. You want to pay a player $10 million and have him be worth $20 million, or $25 million, or more.

  How do we determine the dollar value of a player? It’s simple math, even though I’m sure some of you are already getting all hoity-toity over the idea of putting a dollar figure on a man’s worth. It’s nothing personal, just business. You need to figure out how much baseball value he’s likely to produce, and then figure out how much that value is worth to your team, in dollars. The second part of that is always going to be team-specific: the marginal value of a win—that is, the increased revenue a team can expect from one more win on the field—depends on the team’s market, elasticity of its attendance, place in the standings, likelihood to make the playoffs (the single most valuable win you can get is the one that puts your team into the playoffs), and so on. We can’t really know that part as fans.
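  A bare-bones version of that two-step calculation looks like this. The dollars-per-win figure below is purely hypothetical (the text notes that fans can't really know it, and it differs by team), as are the function names and the sample player; the sketch only shows how the pieces fit together.

```python
def dollar_value(projected_wins: float, dollars_per_win: float) -> float:
    """Step two from the text: translate baseball value (wins) into dollars."""
    return projected_wins * dollars_per_win


def surplus_value(projected_wins: float, dollars_per_win: float, salary: float) -> float:
    """What the team gets back beyond what it pays the player."""
    return dollar_value(projected_wins, dollars_per_win) - salary


# Hypothetical inputs: a team that values one marginal win at $8 million,
# and a free agent projected for 2.5 wins who signs for $10 million.
HYPOTHETICAL_DOLLARS_PER_WIN = 8_000_000.0

value = dollar_value(2.5, HYPOTHETICAL_DOLLARS_PER_WIN)
surplus = surplus_value(2.5, HYPOTHETICAL_DOLLARS_PER_WIN, salary=10_000_000.0)

print(f"value:   ${value:,.0f}")
print(f"surplus: ${surplus:,.0f}")
```

  With these made-up inputs the player is paid $10 million and is worth $20 million, the kind of bargain described earlier. A team closer to a playoff spot might plug in a higher dollars-per-win figure and see a bigger surplus from the same player.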

  But we can know, or more correctly estimate, the value of a player’s total production by valuing all of the individual things he does on the field and then adding all of those values together. For a position player, that would mean valuing his batting contributions, his baserunning contributions, and his fielding contributions. For a catcher, we should probably also consider his pitch-framing—his ability to steal strikes by how he receives borderline pitches, a strange way to contribute value but one that is real and quite significant—although some aspects of catching, like game-calling, remain difficult or impossible to quantify.

  Analysts also consider the value of a position player relative to his position, because the average shortstop does not hit as well as the average left fielder or designated hitter. It is easier to find a player to play a position on that latter end of the defensive spectrum than a player on the end that includes the more demanding positions. Exact position values will fluctuate slightly year to year, but I think of the spectrum this way:

  Positions Hardest to Easiest

  SS, C

  CF

  2B, 3B

  RF

  1B

  LF

  DH

  You can compare a player to the “replacement level” for his position, which is what WAR does, or you can compare the player to the average for his position, as long as you bear in mind that an average player is actually still quite valuable because somewhere between a third and half of the teams probably get less than average production from that spot. (That’s the mathematical average, as opposed to the median, above or below which we’d find exactly half of the teams.) Before we can even do this simple comparison, though, we have to decide how to value everything a hitter does at the plate.

  What is a home run worth?

  It seems like a simple question, with a simple answer: A home run is worth one run. If a player hits a home run, he scores, every time, without fail. That’s worth one run.

  But the home run also has the power to score every runner who’s already on base at the time that it’s hit. That could be zero runners, one, two, or three. Clearly a home run can, in many situations, be worth more than one run, because it can drive in a runner from any base. A runner on first isn’t that likely to score in the abstract, but a home run ensures that he does.

  Figuring out what a home run is worth is part of the larger way that analysts measure total offense. Each plate appearance for a hitter is a discrete event, and each event outcome—an out, a hit, a walk, and so on—has a specific value. If you assign a value to each outcome for a hitter and add them all up, factoring in the ballparks where the hitter played, you get a total value for his offensive contributions.

  The method for doing this is known as “linear weights”: you take the weight (value) of each outcome and add them all up, linearly, without fancy multiplicative effects or exponents or other I-was-promised-there-would-be-no-math voodoo. It is quite simple, using a formula that gives us a form of hitter value in runs:

  Batting Runs = value of a single * number of singles PLUS

  value of a double * number of doubles PLUS

  value of a triple * number of triples PLUS

  value of a HR * number of HR PLUS

  value of a walk * number of walks PLUS

  value of a HBP * number of HBP MINUS

  value of an out * number of outs made

  Some formulas also add in weights for stolen bases and times caught stealing, while others split those out into a separate formula for baserunning, but the concept remains the same. We’re trying to value a player’s offensive production by valuing all the individual things he does and then adding them up. You can also adjust the resulting numbers to reflect the park in which the hitter played his home games: hitting 30 home runs at the launching pad of Coors Field is not as valuable or as difficult as hitting 30 at pitcher-friendly Petco Park, so an advanced offensive metric should reflect that. It doesn’t have to be more complicated than this because making it more complicated doesn’t give us any more accuracy.

  Statistician George Lindsey was the first to publish a method of valuing offense like this, way back in 1963 in the academic journal Operations Research, in a paper titled “An Investigation of Strategies in Baseball,” a paper that had no known effect on MLB at the time (shocker) but did prove highly influential to the generation of baseball analysts and writers outside the industry. These included Bill James, Steve Mann, and John Thorn and Pete Palmer, the latter two of whom brought their linear-weights formula to a wider audience in the seminal 1984 book The Hidden Game of Baseball. (The book is so important that I jumped at the chance to write the foreword when the University of Chicago Press reissued it in 2015.) Palmer and Thorn later used their formulas in the Total Baseball encyclopedias, which were published from 1989 to 2004.

  The Palmer/Thorn linear weights formula, called Batting Runs, derives from Palmer’s simulations of MLB games from 1901 to 1978, which must have been a significant undertaking given the technology available to him at the time. Their version was:

  Batting Runs = .46*1B + .80*2B + 1.02*3B + 1.40*HR + .33*BB + .33*HBP + .30*SB - .60*CS - .25*(AB - H) - .50*(OOB)*
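  As a sketch of how the Palmer/Thorn formula turns a stat line into runs, the function below applies those weights directly. Two assumptions to flag: OOB is read here as outs made on the bases, which this excerpt does not define, and the season line at the bottom is invented for illustration rather than taken from any real player. No park adjustment is applied.

```python
from dataclasses import dataclass

# Weights copied from the Palmer/Thorn Batting Runs formula quoted in the text.
WEIGHTS = {
    "1B": 0.46, "2B": 0.80, "3B": 1.02, "HR": 1.40,
    "BB": 0.33, "HBP": 0.33, "SB": 0.30,
    "CS": -0.60, "OUT": -0.25, "OOB": -0.50,
}


@dataclass
class BattingLine:
    ab: int       # at bats
    h: int        # hits
    doubles: int
    triples: int
    hr: int
    bb: int       # walks
    hbp: int      # hit by pitch
    sb: int       # stolen bases
    cs: int       # caught stealing
    oob: int = 0  # assumed to mean outs on base; not defined in the excerpt


def batting_runs(line: BattingLine) -> float:
    """Sum each event count times its linear weight."""
    singles = line.h - line.doubles - line.triples - line.hr
    batting_outs = line.ab - line.h
    return (
        WEIGHTS["1B"] * singles
        + WEIGHTS["2B"] * line.doubles
        + WEIGHTS["3B"] * line.triples
        + WEIGHTS["HR"] * line.hr
        + WEIGHTS["BB"] * line.bb
        + WEIGHTS["HBP"] * line.hbp
        + WEIGHTS["SB"] * line.sb
        + WEIGHTS["CS"] * line.cs
        + WEIGHTS["OUT"] * batting_outs
        + WEIGHTS["OOB"] * line.oob
    )


# Invented season line, for illustration only.
example = BattingLine(ab=600, h=180, doubles=40, triples=5, hr=30,
                      bb=70, hbp=5, sb=10, cs=5)
example_runs = batting_runs(example)
print(f"Batting Runs: {example_runs:.2f}")
```

  Because the formula is linear, every extra home run on this line adds 1.40 runs (and removes an out or a lesser hit), regardless of what else the hitter did.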

 
