by Keith Law
Statcast’s radar-based system from TrackMan makes all of these ball-related measurements simple; the ball’s velocity, spin rate, and trajectory are part of what the system already tracks for every pitch. The only additional element the system adds after the pitch is a tag of pitch type (two- or four-seam fastball, curveball, slider, changeup, and so on) based on the pitcher’s known repertoire and the other characteristics it’s already measuring. If the pitcher has just three clearly defined pitches, the system might be able to accurately identify all of his pitches after just two or three “training” starts. Some pitchers require more samples for the system to get to the desired 99.99 percent accuracy because their pitches’ velocities might run together, which you’ll often see for a pitcher who throws a fastball, a cutter, and a slider, since a cutter is best described as a hybrid of the other two pitches. But after a few starts or a handful of relief appearances, any pitcher will be tagged accurately in the Statcast system, allowing analysts to evaluate the expected value of his various pitches based on factors like velocity and spin rate.
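To make the idea of tagging pitches from a known repertoire concrete, here is a minimal sketch in Python. It is not how Statcast actually classifies pitches, which the text does not describe; it swaps in a simple nearest-average rule, uses only velocity and spin (ignoring trajectory), and every name and number in it (`learn_repertoire`, `tag_pitch`, the scale factors, the sample pitches) is a hypothetical chosen for illustration.

```python
import math

def learn_repertoire(labeled_pitches):
    """Average the velocity and spin of each pitch type seen in the "training" starts.

    labeled_pitches: iterable of (pitch_type, velo_mph, spin_rpm).
    """
    sums = {}
    for pitch_type, velo, spin in labeled_pitches:
        s = sums.setdefault(pitch_type, [0.0, 0.0, 0])
        s[0] += velo
        s[1] += spin
        s[2] += 1
    return {p: (v / n, r / n) for p, (v, r, n) in sums.items()}

def tag_pitch(velo, spin, repertoire, velo_scale=1.0, spin_scale=100.0):
    """Tag a new pitch with the repertoire entry whose averages are closest.

    The scale factors are assumptions: they treat 1 mph and 100 rpm as
    comparable differences so that neither measurement dominates.
    """
    def distance(pitch_type):
        avg_velo, avg_spin = repertoire[pitch_type]
        return math.hypot((velo - avg_velo) / velo_scale,
                          (spin - avg_spin) / spin_scale)
    return min(repertoire, key=distance)

# Made-up training pitches for a three-pitch pitcher (fastball, curve, changeup).
training = [
    ("FF", 94.0, 2300), ("FF", 95.0, 2350),
    ("CU", 78.0, 2700), ("CU", 79.0, 2650),
    ("CH", 84.0, 1700), ("CH", 85.0, 1750),
]
repertoire = learn_repertoire(training)
```

With three pitches this far apart, a couple of starts is plenty. A fastball, cutter, and slider would sit much closer together in velocity, which is why such a pitcher would need more samples before the tags could be trusted.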
One of the most visible statistics to come out of MLB’s Statcast product is exit velocity—the speed with which a batted ball leaves the bat. We’ve all heard about hard contact, or hitters who can “square up” the ball, but this was the first time MLB was able to provide hard data to support or refute these assertions on certain hitters or pitchers. For hitters, the assumption was that harder contact was better—it would lead to more hits and more power. For pitchers, the assumption was that harder contact was worse; the Pitch f/x data stream allowed teams and sites like Fangraphs to rank pitchers by groundball rate or line-drive rate, but didn’t distinguish between how hard those balls were hit.
As it turns out, the relationship between exit velocity and results is not direct or linear—that is, hitting the ball harder is not automatically better. There’s another variable, launch angle, that comes into play as well; if you hit the ball hard, but hit it down into the ground, then you’re going to hit . . . a very hard groundball. That might sneak through the infield, but it’s probably going to be just a single if it does, and it’s also likely to be a hard-hit out, perhaps even the start of a double play. If you hit the ball hard—even hard enough to undo the seams—but get under it too much, hitting it high in the air, it’s likely to come down short of the outfield fence.
There is a range of angles off the bat, measured by the degrees between the ball’s trajectory when it leaves the bat and the ground itself, that is most likely to produce extra-base hits, which of course are the most valuable outcomes for hitters. In April 2016 for fivethirtyeight.com, which is co-owned by ESPN, Rob Arthur wrote in a piece called “The New Science of Hitting” that balls hit at 90 mph or greater with launch angles around 25 degrees (with several degrees of range around that figure) were pretty likely to leave the ballpark.
In September 2016, MLB introduced a category of batted balls called “Barrels,” which took any combination of exit velocity and launch angle that produced a batting average of at least .500 and slugging percentage of at least 1.500. In simpler terms, it means that balls hit at least that hard and in that range of angles are at least even money to become hits, with many of those hits producing extra bases. In a piece by MLB’s Mike Petriello that introduces the metric, he says that only about 5 percent of plate appearances end in a “Barrel,” and even the best hitters in the game produce a ball in play like that in only about 10 percent of their trips to the plate. The leaders in the category are names you’d expect: Miguel Cabrera, Mike Trout, and Kris Bryant, three of the game’s best overall hitters, are in the top ten, as are all-or-nothing power bats like Mark Trumbo and the two Davises, Khris and Chris.
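The definition above amounts to a filter over buckets of batted balls, which can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MLB’s actual method: the bucket widths (2 mph and 2 degrees), the function name `barrel_buckets`, and the sample data are all hypothetical.

```python
from collections import defaultdict

def barrel_buckets(batted_balls, ev_step=2, la_step=2,
                   min_avg=0.500, min_slg=1.500):
    """Return the (exit velocity, launch angle) buckets that clear both thresholds.

    batted_balls: iterable of (exit_velocity_mph, launch_angle_deg, total_bases),
    where total_bases is 0 for an out and 1 to 4 for a single through a home run.
    Bucket widths are an assumption made for this sketch.
    """
    totals = defaultdict(lambda: [0, 0, 0])  # bucket -> [balls, hits, total bases]
    for ev, la, bases in batted_balls:
        key = (int(ev // ev_step) * ev_step, int(la // la_step) * la_step)
        totals[key][0] += 1
        totals[key][1] += 1 if bases > 0 else 0
        totals[key][2] += bases
    return {
        key
        for key, (n, hits, bases) in totals.items()
        if hits / n >= min_avg and bases / n >= min_slg
    }

# Made-up batted balls: four hit hard in the air, three hit hard into the ground.
sample = [
    (101.0, 26.0, 4), (100.5, 27.5, 4), (100.2, 26.9, 0), (101.9, 27.0, 2),
    (100.0, -5.0, 1), (101.0, -4.5, 0), (100.7, -4.1, 0),
]
qualifying = barrel_buckets(sample)
```

In this toy sample the balls hit at about 100 mph and 26 degrees bat .750 and slug 2.500, so that bucket qualifies; the equally hard groundballs bat .333 and do not.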
We’ve long guessed that harder contact was better, and that hitters needed some loft in their swings—that is, to come up slightly through contact rather than swinging parallel to the ground—to produce line drives and hit for power. Statcast data has not only verified those guesses, but put parameters around them so teams can better identify hitters with the strength to hit balls at least 90 mph and who have the angle in their swing finishes to hit more balls like those MLBAM calls “Barrels.” It also gives coaches at all levels a potential goal for working with young players to develop their swings—if a prospect is making hard enough contact but doesn’t have the correct angle in his swing, whether too much (producing pop-ups and flyouts) or too little (producing groundballs), that’s at least an opportunity for a mechanical change, which could be as simple as changing the hitter’s hands in his load or as involved as reworking how he uses his hips to make his swing more rotational.
Statcast data has also probably put a few baseball myths to rest. Did Mickey Mantle really once hit a ball that would have traveled 565 feet? Almost certainly not, since even today’s hitters, who are bigger and stronger than those of a half century ago, can’t reach that distance. Did Negro Leagues legend Josh Gibson once hit a ball clear out of Yankee Stadium, or one two feet from the top of the facade, in either case hitting a ball that would have gone at least 580 feet? Absolutely not. These legends were always apocryphal, but now we can say with near certainty that they’re bogus.
Another new term to come out of the Statcast data stream is “spin rate,” referring to the number of revolutions per minute a pitch makes on its way to the plate. Again, this takes a common scouting term—spin or rotation, referring to the same thing, with “tight rotation” on a breaking ball especially assumed to be superior—and quantifies it in a way that allows teams to make better decisions. As with exit velocity, however, spin rate doesn’t work in isolation, as its effect is greater when paired with velocity.
Scouts have long associated spin with breaking pitches—the spin of a curveball, combined with the ball’s uneven surface, gives it its break, and depending on how the pitch is released and how hard it’s thrown (that is, how fast the pitcher’s arm is moving at release), it may break a lot or a little, in one plane or two, “early” or “late” from the hitter’s perspective. But all traditional pitches spin, even four-seam fastballs, the pitches associated with the least movement from release to home plate; the only pitches that spin very little are knuckleballs, where the whole point of the grip is to minimize the spin on the pitch.
Spin on four-seam fastballs turns out to be important, as important as velocity and in some cases more so. The industry’s fascination with velocity is nothing new, and is easy to understand. One, we all love seeing “100” pop up on a radar gun or a scoreboard; I was in the ballpark in September 2010 when Aroldis Chapman, then a Cincinnati Red, hit 104 on my own radar gun and 105 on the Petco Park scoreboard, eliciting cheers from the theoretically hostile crowd before the operators turned off the readings to stop fans from applauding the wrong side. Two, velocity has been easy to measure for some time now: radar guns are de rigueur for scouts, reasonably priced for teams (about $1,100 for an industry-standard device), and accurate enough to be a major part of scouting decisions from the draft to trades and pro signings.
Spin rate, on the other hand, was never measured at all, and in my experience was only discussed for breaking pitches: curveballs and, occasionally, sliders, although the latter were more commonly defined by their “tilt” than by their spin. (Tilt here refers to the angle of break on the slider. Throw it too flat and you get a “frisbee” slider that often just moves right into the hitting plane of a hitter on the opposite side of the plate.) The TrackMan system has allowed teams to measure spin rate accurately for major-league pitchers and, where the system is offered, at certain amateur scouting events as well.
On four-seamers, spin rate can compensate for lower velocity, while higher-velocity fastballs without spin tend to be less effective than those with more spin and a few mph less of velocity. For example, Statcast data from 2016 showed that four-seam fastballs of 95 mph with spin rates at 2,400+ rpm (which, loosely defined, is above average) generated swings and misses about 10 percent of the time, roughly the same as the swing and miss rate for fastballs of 100 mph with spin in the more average range of 2,100–2,300 rpm. Throwing harder is good, but not necessary if you put more spin on the pitch. This has implications for scouting, for developing pitchers, and, in light of recent studies showing that pitchers who throw at the top end of their velocity ranges are at higher risk of arm injuries, perhaps even for finding ways to keep pitchers healthy.
It also explains why some pitchers who throw exceptionally hard don’t miss that many bats. Nate Eovaldi, who at the time of writing had just torn the UCL in his right elbow for the second time, necessitating the second Tommy John surgery of his career, has consistently ranked among the hardest-throwing starters in baseball throughout his career, and became a valuable trade commodity in large part due to that one pitch. In 2016, prior to his injury, Eovaldi averaged 97.1 mph with his four-seamer, the highest average velocity of his career, but more than half of his fastballs came in with spin rates under 2,300 rpm. In lay terms, that means they had less spin than the typical four-seam fastballs at that velocity, so while they reached the plate quickly, they were spinning at a slower rate and had less movement. “Straight ball I hit it very much” holds as true in real life as it did for the fictional Pedro Serrano.
One potential application of this data is looking at “expected swinging strikes,” as blogger Andrew Perpetua wrote on Fangraphs in September 2016. We can look at all pitches of a certain type, velocity, spin rate, and location in or out of the strike zone and see how often hitters swung and missed at that category of pitch, and then identify pitchers who throw that sort of pitch and work on getting them to throw to that particular area. Or teams could find pitchers whose swing and miss rates on those pitches were abnormally low, hoping that this was just bad luck or randomness making a pitcher less effective than he would be going forward. Or maybe it’s just a lot of noise—the point here is that such questions were unanswerable before Statcast, and now analysts have so much data that they can try to answer old questions and even come up with new ones based on what’s now being measured.
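A rough sketch of the “expected swinging strikes” idea, in Python, might look like the following. It is not Perpetua’s model, which the text does not detail; it simply groups pitches into buckets by type, velocity, spin rate, and location, measures how often each bucket draws a swing and miss, and then adds up those rates for one pitcher’s pitches. The bucket widths, the function names, and all of the data are assumptions made for illustration.

```python
from collections import defaultdict

def bucket(pitch_type, velo, spin, in_zone, velo_step=2, spin_step=200):
    """Group a pitch by type, velocity band, spin band, and in/out of the zone.

    The band widths (2 mph, 200 rpm) are assumptions made for this sketch.
    """
    return (pitch_type,
            int(velo // velo_step) * velo_step,
            int(spin // spin_step) * spin_step,
            in_zone)

def whiff_rates(pitches):
    """Swing-and-miss rate per bucket.

    pitches: iterable of (pitch_type, velo_mph, spin_rpm, in_zone, whiff).
    """
    counts = defaultdict(lambda: [0, 0])  # bucket -> [pitches, swings and misses]
    for pitch_type, velo, spin, in_zone, whiff in pitches:
        key = bucket(pitch_type, velo, spin, in_zone)
        counts[key][0] += 1
        counts[key][1] += int(whiff)
    return {key: misses / n for key, (n, misses) in counts.items()}

def expected_whiffs(pitcher_pitches, league_rates):
    """Sum the league rate for each of a pitcher's pitches (unknown buckets add 0)."""
    return sum(
        league_rates.get(bucket(pitch_type, velo, spin, in_zone), 0.0)
        for pitch_type, velo, spin, in_zone, _ in pitcher_pitches
    )

# Made-up league sample: high-spin 95 mph fastballs versus low-spin 97 mph ones.
league = [
    ("FF", 95.1, 2450, True, True), ("FF", 95.8, 2410, True, False),
    ("FF", 94.3, 2590, True, False), ("FF", 95.0, 2500, True, False),
    ("FF", 96.5, 2150, True, False), ("FF", 97.2, 2100, True, False),
]
rates = whiff_rates(league)

# A pitcher who threw two high-spin fastballs and got no swings and misses.
pitcher = [("FF", 95.5, 2480, True, False), ("FF", 94.9, 2420, True, False)]
expected = expected_whiffs(pitcher, rates)
actual = sum(int(whiff) for *_, whiff in pitcher)
```

A gap between `expected` and `actual` like this one is what a team would look for, though with real data it would take far more than two pitches to separate bad luck from a real weakness.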
The most immediate impact of Statcast data on the team side has been allowing teams to refine their defensive evaluations to a degree impossible prior to this new product—grading players’ defensive abilities and also improving how they position players for certain hitters or behind certain pitchers. The most substantial change has come about because now analysts can evaluate range based on where any particular player was standing at the moment the ball was put into play.
The idea of “range,” like many other player tools or skills I’ve discussed in this section and in the scouting chapter, has long been a part of how scouts and teams evaluated players, but prior to modern statistical analysis, it was entirely subjective and based solely on what a scout might see in a sample of a few games or even just a few innings. If the scout didn’t see a player have to make a difficult play, or simply wasn’t bearing down on that fielder at the time of such a play (because he was focusing on the hitter or the pitcher or something else), the evaluation would be incorrect. And even evaluating range on the tiny number of nonroutine plays a player, especially an outfielder, might have to make in a handful of games was probably folly.
Every team analyst I asked about this topic had the same answer: Jason Paré of the Marlins said it was a “game-changer,” and James Click of the Rays pointed out that it eliminates many of the assumptions everyone, inside or outside of teams, previously had to make when evaluating defense. Sig Mejdal of the Astros said it was like two people trying to compare the speeds of two runners when only one of them has a stopwatch. “If we’ve got a stopwatch and we know Usain Bolt runs faster than Justin Gatlin, you aren’t right even if you swear Gatlin is better.”
While public defensive metrics like Ultimate Zone Rating and Defensive Runs Saved are directionally correct (if a player posts a UZR over two seasons of +15 runs, he’s almost certainly an above-average defender), they can’t match the precision that teams’ proprietary metrics, based on Statcast data, are able to offer. So while every analyst I asked about these public metrics said they’re worth looking at and a substantial improvement over any previous attempts to quantify fielding value (which one analyst referred to as “nothing”), teams are going to know more about defensive value than those of us who don’t have access to Statcast data.
With Statcast data, now an analyst can gauge whether a fielder is better than his peers, without worrying that where he’s being positioned by his coaches is screwing up the results. On each play, Statcast provides each fielder’s starting position, the trajectory of the ball, the route taken by the fielder who fielded it (or tried to and failed), that fielder’s running speed, and how far off the ground the ball was when the fielder caught it. You can look at a player’s maximum range, or typical top speed when making a difficult play. Statcast’s own tools allow anyone to make a graphic of all of an outfielder’s plays with the starting positions all normalized to a single spot (called a “polar” view), making it look sort of like a spider with a billion legs—but it’s immediately apparent how much or how little range a player has from a graphic like that, because you can see how far the “legs” go in each direction away from the starting position.
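The normalization behind that “polar” view is simple enough to sketch in Python. This is an illustration only, not Statcast’s own tooling: it assumes each play comes with the fielder’s starting and ending coordinates in feet, with home plate at the origin, +y pointing toward center field, and +x toward the right-field side. The function names and the sample plays are hypothetical.

```python
import math

SECTORS = ["back", "right_field_side", "in", "left_field_side"]

def polar_legs(plays):
    """Re-express every play as (distance, bearing) from a common starting spot.

    plays: iterable of (start_x, start_y, end_x, end_y) in feet. The coordinate
    system is an assumption of this sketch: home plate at the origin, +y toward
    center field. A bearing of 0 means straight back, 180 means straight in.
    """
    legs = []
    for sx, sy, ex, ey in plays:
        dx, dy = ex - sx, ey - sy
        distance = math.hypot(dx, dy)
        bearing = math.degrees(math.atan2(dx, dy)) % 360
        legs.append((distance, bearing))
    return legs

def max_range_by_sector(legs):
    """Longest "leg" in each 90-degree sector centered on back, right, in, left."""
    best = {}
    for distance, bearing in legs:
        name = SECTORS[int(((bearing + 45) % 360) // 90)]
        best[name] = max(best.get(name, 0.0), distance)
    return best

# Made-up plays for one outfielder, each with a different starting position.
plays = [
    (0, 300, 0, 260),     # 40 feet straight in
    (10, 310, 10, 340),   # 30 feet straight back
    (-5, 295, 25, 295),   # 30 feet toward the right-field side
    (0, 300, -3, 296),    # 5 feet in and slightly toward left
]
ranges = max_range_by_sector(polar_legs(plays))
```

Because every play is measured from where the fielder actually started, the result says something about the fielder rather than about where his coaches put him.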
This insight does more than just tell teams who the good defenders are; it has already helped change the way teams position their current defenders for each batter. If you know your center fielder has good range coming in—on balls that are likely to fall or be caught in front of his starting position—but has a weakness on balls that require him to run backward, away from home plate, then you will likely start him a few feet farther from home plate, compensating for his weak spot and taking advantage of his strength. If you know your shortstop makes a higher percentage of plays to his right—the “hole” between short and third base—but lets an atypical percentage of balls get by him to his left, you might set him up a step or two toward second base to compensate.
Furthermore, Statcast data can help further refine positioning beyond the simple shifts, such as moving a third infielder to play in short right field against left-handed pull hitters, that we’ve seen in baseball over the last five years. Now the most advanced teams, like the Astros, Cubs, and Rays, are modeling likely outcomes based on the hitter, the pitcher, and the capabilities of the fielders to determine where the ball is most likely to be put in play and to station defenders in those spots. It ruins the symmetry of the standard defensive alignment—Mejdal specifically cited the emotional satisfaction we get from a traditional setup that “looks right and feels right,” but that the data say is wrong. It’s better to get more outs, even if it looks ugly on paper, than to set up a pretty defense that lets more balls fall in for hits.
Taking all these different new pieces together, Statcast data are the next frontier in statistical analysis. Where the metrics I described in Part Two by and large took the data we already had and figured out how to interpret them in more meaningful ways, Statcast data completely change the player evaluation paradigm. OBP and FIP tried to better isolate how often players did something of value. UZR, Batting Runs, and WAR attempted to value performances in terms of runs or wins added to the team. Statcast gets more granular, giving teams the raw data to evaluate players’ specific skills—how hard a hitter hits the ball and at what angle, how fast a player runs in the field or on the bases, how fast a pitcher’s curveball spins, how far out in front of his body he releases each pitch.
The old data got us to the atomic level, but Statcast data get us to subatomic particles we couldn’t measure before the new technology arrived. Where this takes the industry is baseball’s next frontier.
18
The Edge of Tomorrow:
Where the Future of Stats Might Take Us
The sabermetric revolution in baseball has already happened. There are no longer any holdouts among MLB front offices; by the start of 2017, all thirty organizations had established analytics departments, employing multiple people, often with Ph.D.s in computer science specialties, charged with gathering data and using them to answer questions from the GM or the coaching staff, or to look for previously undiscovered value in the market for players. If your local writer is still talking about players in terms of pitcher wins, saves, or RBI, he’s discussing the role of the homunculus in human reproduction. The battle is over, whether the losers realize it or not.
The way in which teams use analytics (a catch-all term that covers the collection and storage of data as well as the use of those data to produce usable insight) has changed over the last fifteen years, to the point where it is now standard for teams to employ departments full of analysts who have distinct jobs and vary in their level of interaction with the traditional baseball people. The Pirates have had someone travel with the major-league team to road games for several years now, helping manager Clint Hurdle and his staff work on positioning defenders, a major factor behind their three straight playoff appearances from 2013 to 2015. When current Houston GM Jeff Luhnow was the scouting director for the St. Louis Cardinals, he and his consigliere Sig Mejdal introduced a program that took in players’ performance data as well as information found in the scouts’ reports and ranked all the players in an objective fashion, telling Luhnow, in effect, which player to select each time the team picked. The Astros now use a similar model, as do multiple other teams, following the Cardinals’ example; the Angels, long considered one of the less statistically savvy teams, are the most recent club to adopt a draft-room algorithm that turns drafting into an actual process rather than a set of opinion-driven and inconsistent decisions. A decade ago, using analytics in baseball operations represented a competitive advantage; now it is a business necessity, if for no other reason than to understand what your twenty-nine competitors are up to.