Smart Baseball

by Keith Law


  The answer is yes . . . and no. While it’s by no means a “new” stat, ERA has clear value in helping us understand what happened while the pitcher was on the mound, but it’s very noisy, meaning that there are a lot of other confounding factors that can cloud the “signal,” the indication of just how well that pitcher actually pitched. You’ll know a lot more about how well a pitcher performed if you look at ERA rather than won-lost record, but you’ll still only have a vague idea because of how ERA is calculated and because it assumes that the pitcher was solely responsible for those runs that he allowed (or didn’t allow).

  There are single metrics that encapsulate the total production of a hitter, whether as a rate (wOBA, wRC+) or as a total (Batting Runs or any other linear-weights stat). Offense, as the sabermetricians in the Bluth family might say, is a freebie: other than park effects there isn’t a ton of noise to clean up in a typical hitter’s line. We can talk about whether a hitter did something that he’s unlikely to repeat, but calculating the value of what he did is pretty straightforward.

  Pitching, on the other hand, is much harder to evaluate the same way, because pitching isn’t simply the converse of hitting. A hitter’s performance is an independent phenomenon, but the same can’t be said of a pitcher’s performance, which includes all kinds of outside influences that can’t be easily disentangled from the pitcher’s own contributions. The pitcher is part of a team’s run prevention, and so is the team’s defense, a relationship that has been at the heart of sabermetric advances and debates for the last fifteen years and that helped drive MLB to adopt new technology so teams can make more progress in this area. Very little of what a pitcher does is “clean” data; if he strikes a batter out or walks a batter, that’s just his responsibility, but when the ball is put into play the credit or blame is shared across many players.

  This itself is a break from past thinking. Before the year 2000, everyone assumed a pitcher was totally responsible for whether he gave up more or fewer hits. A groundbreaking study around that time showed that this was not the case, and parsing the responsibility for a hit has been a grail of sorts for analysts since then. It turns out that the defense behind a pitcher matters—the quality of the fielders and where they’re positioned affects whether a ball hit into play becomes a hit.

  Traditional pitcher stats also assign full blame for a run scoring on the pitcher who first put the runner on base. Smith walks a batter, is pulled for reliever Jones, who gives up a home run. Smith is charged with the first run, but didn’t Jones play some part in that? Exonerating the reliever who allows inherited runners to score is like accepting “the runner was already scoring when I got here” as an excuse. But splitting that up isn’t as easy as dividing the run in half or into quarters, either.

  The good news is that there are myriad ways to attack the question of how much a pitcher’s performance was worth. There are multiple new metrics that get at this question, most of which try to adjust the credit or blame to account for the influence of defense or other pitchers. And there are some old stats that, while imperfect, still contain useful information that we shouldn’t discard just because the stats are old. One stat in particular tells us less than we thought but still tells us something we want to know: earned run average, commonly known as ERA.

  First, let’s look just at what ERA is and what it purports to measure. ERA’s formula is fairly simple and, for the young me at least, a good way to make a baseball-obsessed kid practice things like multiplying by 9 and dividing double-digit numbers:

  ERA = Earned runs allowed * 9 / Innings pitched

  The idea is to produce a rate stat that shows how many earned runs the pitcher allowed per nine innings pitched, a concept that made perfect sense in an era where pitchers routinely threw complete games, but that still works today because we are very used to thinking in chunks of nine innings. It also creates an easy baseline to compare a pitcher to the league average for runs scored or allowed (same thing) per game. In a league where the average team scores 4 runs a game, a pitcher with a 4.50 RA (run average, slightly different from ERA for reasons I’ll get to a bit later) is worse than average; and a pitcher with a 3.50 RA is better.
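  As a rough sketch of that arithmetic, the formula fits in a few lines of Python. The function names and the two sample lines (70 and 90 earned runs over 180 innings) are made up for illustration and don't belong to any real pitcher. The helper assumes innings arrive in box-score notation, where "78.2" means 78 and two-thirds innings rather than 78.2 in decimal.

```python
def innings_to_outs(ip: str) -> int:
    """Convert box-score innings ('78.2' = 78 and 2/3) to outs recorded."""
    whole, _, frac = ip.partition(".")
    thirds = int(frac) if frac else 0
    if thirds not in (0, 1, 2):
        raise ValueError(f"not valid innings notation: {ip!r}")
    return int(whole) * 3 + thirds


def era(earned_runs: int, ip: str) -> float:
    """Earned runs * 9 / innings pitched, computed from outs to avoid the .1/.2 trap."""
    outs = innings_to_outs(ip)
    if outs == 0:
        raise ValueError("ERA is undefined when no outs were recorded")
    # 9 innings = 27 outs, so ER * 9 / (outs / 3) == ER * 27 / outs
    return earned_runs * 27 / outs


# Hypothetical lines, for illustration only.
print(f"{era(70, '180'):.2f}")  # 3.50
print(f"{era(90, '180'):.2f}")  # 4.50
```

  Working in outs rather than decimal innings is a design choice, not part of the formula itself; it just keeps "78.2" from being read as 78.2.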

  Earned runs allowed form a subset of total runs allowed; a pitcher can allow a run that is “unearned,” in the discretion of the official scorer, if the run scored as a result of a fielder’s misplay, either a fielding error or a passed ball. It is incredibly subjective, and can lead to absurd consequences in the case of runs allowed with two outs after an error that would have ended the inning.

  For example, on June 5, 1989, Yankees pitcher Andy Hawkins allowed 10 runs in 2+ innings of work against the Orioles, but his ERA went down because none of those runs was earned. In the first inning, Hawkins retired the first two batters, then allowed a double and two walks to load the bases. The next batter, Jim Traber, hit a flyball that center fielder Jesse Barfield misplayed, allowing all three runs to score. Even though Hawkins put all three runners on base himself, the runs were unearned because they scored on an error.

  In the third inning, things got ridiculous. The first three batters reached on fielding errors—one by Hawkins himself—allowing the first batter to score. I think we can all agree that dinging Hawkins for that run would be misleading, since he did his job as a pitcher, getting three groundballs that were not fielded cleanly. Hawkins didn’t retire another batter at the plate, allowing the next five batters to reach via a single, a double, an intentional walk, and two more singles, with three more runs crossing the plate. Reliever Chuck Cary came in, got a groundout, and then gave up a grand slam, which scored three more of Hawkins’s runners, for a total of ten runs charged to Hawkins . . . all of them unearned, because by the time Steve Finley hit that slam, there were two outs.

  Once there are two outs and the third out should have been made but wasn’t due to an error, every run that scores thereafter is unearned. Hawkins had some bad luck, but he also gave up four clean hits and a walk, which you cannot do without either allowing a run or getting two outs on the bases somewhere. For him to walk out of this with zero earned runs despite allowing eight baserunners in 2⅓ innings pitched is both bizarre and unhelpful if we’re trying to use ERA to gauge his performance on that day or on the season as a whole. (As if that weren’t enough, Hawkins tied for the American League lead with 111 earned runs allowed that year, and led the league with 127 total runs allowed, finishing with the fourth-worst ERA in the American League.)

  Hawkins’s 10 unearned runs allowed is tied with Tim Wakefield’s performance for the Red Sox on May 5, 1996, for the unofficial record for the second-most UER allowed in one game.* Wakefield’s case was more straightforward. In the fourth inning against Toronto, Wakefield allowed a single, a line drive misplayed by the third baseman, and a strikeout, so there was one out and a man on base who’d reached via an error. The first run scored on a passed ball, which is automatically unearned, although Wakefield, a knuckleballer, was especially prone to wild pitches (earned) and passed balls (not). The batter then singled home the runner who’d reached on an error, so that was unearned. After a walk and a flyout, there were men on first and third with two outs, but the hypothetical third out should have been recorded, making everything that came after “unearned” even though there were no more errors or passed balls in the inning.

  From there? Single, single, double, two-run home run. Six more runs scored, all unearned even though the runners who scored reached base safely via walk or hit and scored via walk or hit. Wakefield gave up one earned run in the sixth and left the game with two outs and men on first and second. The first batter faced by reliever Mike Maddux (yes, Greg’s brother) hit a flyball to right that was misplayed for a two-base error that scored both of the runners Wakefield allowed to reach base safely. That’s 11 runs allowed but just one earned run despite Wakefield allowing ten hits and five walks.

  This bit of selectivity, identifying some runs that count against a pitcher’s ERA and some that don’t, is also incredibly subjective because the decision on what constitutes an error belongs to one person, the official scorer, not an analyst, scout, or anyone who might at least bring a better background to the job. A fielder who fails to make a routine play but never touches the ball will almost never be charged with an error, even though the spirit of the earned run rule is to charge the pitcher only with runs he was responsible for allowing; if he got that groundball but the third baseman never touched it, it’s not an error, but we might say the pitcher did his job.

  ERA is of limited use when looking at starting pitchers, but it can really miss the mark for relievers, who each year throw about one-third the innings that starters do and frequently come into the game with men on base or leave the game with runners on base for the next guy to try to strand. The problem of inherited runners is one that baseball people have understood for a long time; stats such as Inherited Runners and Inherited Runners Scored have been available for twenty years, showing how many runners were on base when a reliever entered games and how many of them he allowed to cross the plate. Stranding runners is a good thing, but it isn’t quite a separate skill from just getting batters out in the first place; if a pitcher can pitch well from the stretch rather than the windup, he’ll strand lots of runners, and if he’s not very good from the stretch, well, he’s probably not going to last long enough in the big leagues for us to get very worked up about his strand rate.

  But inherited runners and their analogue, the bequeathed runner—one can imagine a Victorian-era baseball game in which a pitcher departing the game presented his successor with a document laying out exactly which runners were the property of the latter and limiting the ways in which he might dispose of them—pose the problem described earlier in the chapter of assignment of blame. A reliever who comes into a game with the bases loaded and promptly gives up a home run is charged with only one run allowed, even though four scored as a direct result of his actions. And just imagine how that pitcher who walked off the mound having put a runner on every base felt as he watched the home run leave the yard; he’d probably think a somewhat less family-friendly version of “Ye gods, there goes my ERA!”

  Distilling the reliever’s performance is not fundamentally different from distilling a starter’s performance; any pitcher’s job is to get outs, preferably via strikeouts (because they can’t advance any runners or be misplayed), while avoiding walks and home runs. If all you do is strike guys out, you’re not going to let any inherited runners score, in case that wasn’t already blindingly obvious to you. A reliever who does these things will be a Good Reliever, and may even grow up to be a Proven Closer™, which is worth a Lot of Money. Current relievers Aroldis Chapman, Craig Kimbrel, Kenley Jansen, and Dellin Betances have posted some of the best strikeout rates in baseball history, and, not coincidentally, they are Good Relievers, with the first three also becoming Proven Closers™ in recent years.

  The problem of inherited and bequeathed runners makes ERA especially dicey when looking at a single season for a reliever. A full year of pitching for a reliever might only include 60 innings, so a swing of three runs allowed—say, one day when the reliever left a game with the bases loaded and two outs, only to have the next reliever give up a grand slam—will raise his ERA for the entire year by 0.45. A difference of seven runners bequeathed, scoring or not scoring, would mean over a run in the pitcher’s ERA for the year, even though the pitcher himself would not have pitched any differently. (A pitcher who allows 15 runs in 60 innings pitched would have an ERA for the season of 15 * 9/60 = 2.25. That same pitcher, with seven more runs allowed on the year due to other relievers allowing them to score, would see his ERA rise to 22 * 9/60 = 3.30.)
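  The sensitivity described above is easy to check. This is a minimal sketch using only the numbers in the paragraph (60 innings, 15 runs, a three-run swing, seven extra runs); the function and variable names are my own labels, not standard terminology.

```python
def era_from_innings(earned_runs: float, innings: float) -> float:
    """ERA = earned runs * 9 / innings pitched (whole innings only here)."""
    return earned_runs * 9 / innings


INNINGS = 60  # a full season for the reliever in the example

baseline = era_from_innings(15, INNINGS)
with_seven_more = era_from_innings(15 + 7, INNINGS)
three_run_swing = era_from_innings(3, INNINGS)  # ERA cost of 3 extra runs

print(f"baseline ERA:          {baseline:.2f}")         # 2.25
print(f"with 7 more runs:      {with_seven_more:.2f}")  # 3.30
print(f"cost of a 3-run swing: {three_run_swing:.2f}")  # 0.45
```

  Each additional run costs 9/60 = 0.15 of ERA at this workload, which is why a handful of bequeathed runners can move a reliever’s season line so much.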

  We want our relievers to hold leads, which often means preventing inherited runners from scoring, but then we judge them on superficial statistics that don’t reflect whether they did so. Of course, getting outs is any pitcher’s primary job, and that will be reflected to some extent in ERA and to a greater extent in other statistics like his opponents’ OBP, but it’s too easy for a pitcher to come in, fail to clean up someone else’s mess, and leave with his ERA intact.

  The other major problem with using ERA to measure relievers is the result of the small workloads of the modern reliever: one really bad outing can torch a reliever’s season ERA, even if he pitches well in all of his remaining appearances.

  Hall of Famer John Smoltz came back as a reliever after missing the 2000 season due to Tommy John surgery, working a mostly full season in 2001 and becoming Atlanta’s full-time closer after spending his pre-knife career in the rotation. In 2002, he started the year by throwing a scoreless inning on April 1, and then had the worst outing of his entire career five days later, recording two outs while allowing eight runs (all earned) in an 11–2 loss to the Mets. His season ERA after that outing stood at a comical 43.20, and it never dipped below 3 for the rest of the season. But for the rest of the 2002 season, Smoltz pitched just like the old John Smoltz: 78.2 innings, 22 runs allowed (21 earned), 81 strikeouts, and a 2.40 ERA from April 7 onward, bringing his ERA down from 43.20 to 3.25 by the end of the year. He didn’t allow more than three runs in any other appearance in 2002, and in 69 of his 75 appearances on the year, he allowed one run or fewer. That one torching he received on April 6 added nearly nine-tenths of a run to his season ERA and meant the difference between finishing in the top 20 among major-league relievers in ERA that year and finishing 44th overall, just below the MLB median for full-time relievers with at least 60 innings pitched in 2002.
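  To see how much weight that single outing carries, here is a small sketch that rebuilds the season from the three pieces given above: one scoreless inning on April 1, two outs and eight earned runs on April 6, and 78.2 innings with 21 earned runs afterward. The tuple layout and helper names are assumptions for illustration; innings are converted to outs by hand (78.2 innings is 78 * 3 + 2 outs).

```python
def era_from_outs(earned_runs: int, outs: int) -> float:
    """ERA from outs recorded: 27 outs make nine innings."""
    return earned_runs * 27 / outs


def combined_era(*lines: tuple[int, int]) -> float:
    """Pool several (earned_runs, outs) lines into a single ERA."""
    total_er = sum(er for er, _ in lines)
    total_outs = sum(outs for _, outs in lines)
    return era_from_outs(total_er, total_outs)


# (earned runs, outs recorded), taken from the figures in the text
april_1 = (0, 3)                 # one scoreless inning
april_6 = (8, 2)                 # two outs, eight earned runs
rest_of_year = (21, 78 * 3 + 2)  # 78.2 innings, 21 earned runs

season = combined_era(april_1, april_6, rest_of_year)
without_april_6 = combined_era(april_1, rest_of_year)

print(f"April 7 onward:         {combined_era(rest_of_year):.2f}")  # 2.40
print(f"full season:            {season:.2f}")                      # 3.25
print(f"season without April 6: {without_april_6:.2f}")             # 2.37
print(f"cost of one outing:     {season - without_april_6:.2f}")    # 0.88
```

  The last line is the "nearly nine-tenths of a run" in question: eight earned runs spread over roughly 80 innings.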

  Those eight runs he surrendered still counted—throwing out an outlier just because we don’t like it is not good science—but their effect on Atlanta’s season didn’t extend beyond that one game. Had he had two four-run outings, that would have been worse for the team. Four more two-run outings might have been worse as well, given the way he was typically used, with about two-thirds of his appearances coming with Atlanta tied or winning by one or two runs. For relievers, ERA or even RA is accurate, but it doesn’t tell us what we particularly want to know about the pitcher. For any pitcher, ERA includes all kinds of noise that obscures the part of run prevention for which the pitcher was truly responsible, at least to the best of our current understanding.

  Though these issues with ERA for starters and relievers are part of the problem, the effect of the defense playing behind the pitcher is a larger one. For much of baseball history, what we believed about pitchers and allowing runs was simple: if the pitcher allowed the runners to reach and score safely, those runs were on him. It was his job to get outs, any way he could, whether via strikeout or groundout or flyout. Every hit was his fault, so to speak, so he’d be charged with any runs resulting from that hit, whether it was that hitter scoring or knocking in another run.

  Today teams look at the question differently, because we’ve increased our understanding of how defenses, from fielding prowess to fielder positioning, affect pitchers’ stat lines. The same pitcher might have two different results pitching the same way with the same balls put into play but two different defensive units behind him: the line drive Adrian Beltre fields at third base might get by Nick Castellanos, turning an out into a hit through no fault of the pitcher’s own.

  There is some pitcher effect on this phenomenon, because pitchers have some control over the types of balls in play they allow. Some pitchers generate lots of groundballs; some are more flyball-prone. Pitchers who give up lots of line drives tend not to last very long because a line drive is about three times as likely to become a hit as a groundball or a flyball. Infield popups are almost never hits, and sometimes are outs by statute (the infield fly rule, which applies with at least two men on base and fewer than two outs). So a pitcher can help his cause in this way, but he can’t direct a groundball right to a fielder and he can’t make that fielder any better than he actually is.

  So ERA is a noisy statistic, but that doesn’t mean it’s devoid of value. Stripping all context out of ERA to use a component-based alternative also means we’re removing some factors the pitcher really could control. I mentioned above that some pitchers exhibit a little control over the results of balls they put into play. It’s also true that some pitchers pitch noticeably worse with runners on base (that is, from the stretch, as opposed to pitching from the windup), which can contribute to a gap between their ERAs and what their components might indicate. A pitcher who’s terrible with men on base will allow more runs than a pitcher who allows the same rates of hits and walks but doesn’t lose effectiveness with men on. Analysts refer to this aspect of run prevention as “sequencing.” The order in which things happen matters, even when those things are independent of each other.

  The philosophical discrepancy between ERA and alternative measures draws from whether you want a number that is descriptive or one that is more prescriptive. The pitcher with the 4.50 ERA but whose peripherals say he should have had a 3.50 ERA may have just been unlucky, or he may have made poor pitches in high-leverage spots, or some of both. Using component-based stats absolves him of all sins. Using ERA and value-based stats that build off it reflects what actually happened—the pitcher let these batters reach base and they scored. I find this a little unsatisfying because it still hurts the pitcher working in front of a bad defense or whose bullpen comprised seven guys with gas cans and matches, but even in this somewhat theoretical world of what “should have happened” it’s important that we don’t lose sight of the runs on the scoreboard.

  In 2001, an analyst named Voros McCracken discovered that pitchers’ hit rates, or batting averages allowed, had little correlation year over year once you factored out the strikeouts, after he noticed that the pitchers who led the league in hits allowed tended to vary wildly from year to year and often included names you wouldn’t expect to see on the wrong end of the leaderboards. He showed that pitchers whose batting averages allowed on balls hit into play, now known as BABIP, deviated from the league average tended to regress toward the league average the following year, to the point where we would do better at predicting a pitcher’s ERA the next year if we assumed he’d have a league-average BABIP than if we used his actual BABIP. McCracken called the new system DIPS, for Defense-Independent Pitching Statistics, and his discovery underlies all advanced pitching metrics still in use.
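  For readers who want to see what goes into BABIP, here is a minimal sketch. The text above doesn’t spell out the formula; the version below is the commonly published one, (H - HR) / (AB - K - HR + SF), which removes home runs and strikeouts because neither involves a ball the fielders can play. The stat line is hypothetical and exists only to exercise the function.

```python
def babip(hits: int, home_runs: int, at_bats: int,
          strikeouts: int, sac_flies: int) -> float:
    """Batting average on balls in play, using the commonly cited formula.

    Home runs leave the field of play and strikeouts never enter it,
    so both are removed; sacrifice flies are balls in play that do not
    count as at-bats, so they are added back to the denominator.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical season line allowed by a pitcher, for illustration only.
value = babip(hits=200, home_runs=20, at_bats=800, strikeouts=150, sac_flies=5)
print(f"{value:.3f}")  # 0.283
```

  McCracken’s point, in these terms, is that a pitcher’s figure here tends to drift back toward the league’s from one year to the next, so it says less about the pitcher than the raw hit totals suggest.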

 
