CHAPTER 1
A Brief History of Modern Basketball Analytics
The similar revolution in baseball took a couple of decades, at least, and this took about half the time, in part because [baseball] helped pave the way.
—Kevin Pelton, NBA writer for ESPN Insider
Long before culling data from gigantic sets of inputs became a highly valued NBA front-office skill, and before fans came to accept various types of quantitative analysis as a necessity for better understanding how the sport is played, an unknown economist may have spearheaded the first effort to use computers and statistics to project professional basketball performance.
Louis Guth was a senior vice president in the New York offices of National Economic Research Associates (NERA), an economics consulting firm, in the early 1980s. At the time, Guth was providing advisory services to the North American Soccer League in its antitrust lawsuit against the National Football League, one that alleged that the NFL’s prohibition on cross-ownership (owning franchises in multiple sports leagues) was damaging the soccer league by denying it access to potential sports capital investment and operational expertise.
As part of his work for the lawsuit, Guth had to conduct economic analyses of the value of sports franchises. As detailed in a July 1980 New York Times article that Guth authored, he determined that sports franchises, especially those in major metropolitan areas with high levels of per capita income, such as New York City, were inherently undervalued based on how they were priced when they were bought and sold in that era.
Guth’s analysis focused on the intrinsic, long-term values of franchises, which rested more heavily on factors like national TV revenues and the potential value of the home market at large than on ticket revenues or the current state of the franchise’s personnel. Those latter factors, Guth claimed (correctly, as it turned out), were easily correctable with the hiring of better management and players, and had very little to do with the value of the franchise as an asset. Guth also smartly realized the unique position that sports franchises of that era held in terms of unpaid promotion through the major media entities in their respective cities.
“I said, ‘Look, you guys are getting two to three pages of newspaper [every day].’ Other businesses would kill for that,” Guth recalled via phone from Florida, where he is now retired.
In the Times article, Guth used that era’s New York Mets, who had just been purchased for what was an all-sports record price of $25 million by Nelson Doubleday and others, as a prime example. The Mets were terrible when they were purchased, but the club quickly developed a core of good players, traded for more, and eventually won the 1986 World Series. While that was happening, ticket sales for the team exploded, and the promise of the asset was realized.
When Guth was done with his work for the trial, the valuation analyses he had done made him think more specifically about how that type of work, aided by early-era computer technology, could be applied to sports themselves—and more specifically, to the monetary and performance value of players. While baseball already was in the early stages of its own statistical analysis revolution, driven by stats pioneer Bill James and the Society for American Baseball Research (SABR, which is the acronym that spawned the term sabermetrics for baseball analysis), Guth didn’t see any comparable presence in professional basketball. That was in large part due to the inherent differences between the two sports. Baseball was a far more popular spectator sport than the NBA was during that era—one in which the NBA Finals were still shown late-night on tape delay—but more important was the nature of the sport and the history involved in its record keeping.
Baseball is a game of discrete, one-on-one, well-defined interactions between a hitter and a pitcher, and while current-era data analysis has expanded our understanding well beyond what was happening in 1980 (especially on the defensive side of the sport), it’s still a much simpler sport to analyze than basketball, in which each play on the court involves ten players moving in dynamic, undefined, and unlimited patterns. It is quite easy to determine exactly how much offense a batter is able to produce or how effective a pitcher is in limiting opposing offenses. It is much more difficult to accurately assess the value of individuals in a sport of team-based actions.
That whole series of factors created a market opportunity that Guth was eager to step into.
“To my knowledge, there wasn’t a heck of a lot of statistical analysis on basketball, and I migrated from other things I was doing because it looked like a wide-open field at the time,” Guth said. “You had Bill James coming out with his baseball people and were looking at numbers a lot, and there was an early piece in the American Economic Review [about baseball]. But, to my knowledge, I’m not sure there were other things out there [about basketball].”
Guth set out to examine basketball through the lens of the economic principles that underscored his normal work. In his mind, a lot of the work at the time being done on baseball dealt with estimating the value of what he called the individual players’ “marginal product,” which, in economics parlance, is the output that results from one additional unit of a factor of production. Essentially, once you determined what a batter’s capabilities were, it was reasonable to be able to project how he would do in a series of individual at-bats and to determine his composite product by adding up all of his estimated at-bats for a season. Additionally, while baseball games are constrained by outs, they are not constrained by a particular number of at-bats, or production “units,” so to speak.
Because of its team-based, dynamic nature, basketball isn’t nearly that linear and is much more complicated. The other four players on the court with a particular player directly impact his ability to produce, for better or worse. Also, because professional games are 48 minutes long, with five players on the court at any one time, you are constrained to 240 total minutes of production for a game. As a result, as Guth explains, “any time you add somebody, you can’t say he brings all the talent he has. He also replaces somebody,” which has to be accounted for in the analysis.
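Guth’s replacement point can be made concrete with a toy calculation. The Python sketch below is not his proprietary model; it simply assumes hypothetical per-minute production ratings and shows that a newcomer’s net value is his production minus the production of whoever loses the minutes, within the fixed 240-minute budget.

    # Toy illustration of Guth's replacement argument (not his actual model).
    # Per-minute "production" ratings below are hypothetical.
    TEAM_MINUTES = 48 * 5  # 240 available player-minutes per NBA game

    def net_gain(new_player_rate, displaced_rate, minutes_shifted):
        """Production added minus production lost from the displaced minutes."""
        return (new_player_rate - displaced_rate) * minutes_shifted

    # A star producing 0.60 units per minute who absorbs 30 minutes from a
    # 0.45 units-per-minute reserve adds 4.5 units, not 0.60 * 30 = 18.
    print(net_gain(0.60, 0.45, 30))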
By 1982, Guth had created a database of all available NBA statistics from the league’s most recent few seasons, and built a proprietary program first called FAMS (Free-Agent Market Simulator) and then FAME (Free-Agent [and Trades] Market Emulator) that crudely allowed him to determine a value for adding a new player to an existing team. It was groundbreaking stuff. As Guth noted with a chuckle during our phone conversation, his biggest mistake may have been in how he marketed his output.
“I should have called the stat ‘wins against replacement,’” he said, paraphrasing a calculation that’s now commonly used in sports in similar replacement analyses. “WAR is a fundamental Economics 1 concept, adapted to the reality of sports, which over the course of the season is pretty well set.”
As detailed in an August 1982 Sports Illustrated article written by now-famed basketball writer Alexander Wolff, Guth became most well known for his analysis concerning Moses Malone, the league’s reigning MVP and a future Hall of Fame center who at the time was a free agent after having completed his contract with the Houston Rockets. Thanks in part to collusive efforts of the NBA owners at the time, Malone was not receiving offers from franchises other than Houston, and Guth believed that to be a huge mistake on those other teams’ parts.
Guth’s economics roots shaped a system that focused on which teams should pursue a player like Malone based on the projected financial gain that player would give a new team, driven by improved performance on the court in relative terms. But when he focused more singularly on the projected on-court performance of the new player, exclusive of the monetary aspects, the story changed a bit in terms of where a specific player like Malone would make the most impact.
As detailed in a NERA company newsletter in 1984, much of Guth’s system was based on proprietary formulas that tried to place values on teams’ outputs at both ends of the floor. Offensive rebounds factored strongly into Guth’s offensive formula, and in that era, there was one top-level team that seemed to have many of the ingredients of a world champion, but was relatively weak on the offensive glass: the Philadelphia 76ers.
Guth went and compared the 76ers to the two other premier teams in the league at that time: the Boston Celtics, who had won the 1981 NBA championship, and the Los Angeles Lakers, who had won the title over Philadelphia in both 1980 and 1982. Against both of those imposing foes, the 76ers had an offensive efficiency disadvantage, especially against the Lakers thanks to the dominant inside scoring of Hall of Fame center Kareem Abdul-Jabbar.
The easiest fix to that problem as Guth saw it, based on his formula, was for Philadelphia to improve its offensive rebounding. During the 1981–82 season, the 76ers had only collected 1,031 offensive rebounds, which Guth calculated to be a 30 percent offensive rebound percentage. That was far below what the Celtics and Lakers were doing on that end, and something that could be fixed very quickly with the addition of a dominant offensive rebounder. It so happened that one was potentially available during the summer of 1982 in Moses Malone.
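The newsletter does not spell out how Guth arrived at that 30 percent figure. As a rough check, the modern convention, sketched below, divides a team’s offensive rebounds by its total offensive rebounding chances (its own offensive rebounds plus its opponents’ defensive rebounds); the opponent figure used here is a hypothetical one chosen only to illustrate a result near 30 percent.

    # Offensive rebound percentage by the modern convention (not necessarily
    # Guth's exact formula): team ORB / (team ORB + opponent DRB).
    def orb_pct(team_orb, opp_drb):
        return team_orb / (team_orb + opp_drb)

    # The 76ers' 1,031 offensive rebounds in 1981-82; the opponent defensive
    # rebound total below is a hypothetical value implying roughly 30 percent.
    print(round(orb_pct(1031, 2400), 3))  # ~0.30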
The free-agent rules at the time gave a player’s previous team the right of first refusal on any offer sheet he signed with another club. In a move that would make modern-day front-office personnel tip their caps in appreciation, the 76ers attempted to load their offer sheet for Malone with financial incentive clauses designed so that the Rockets would not match it, allowing Philadelphia to sign Malone without providing any compensation to Houston.
The case ultimately ended up in arbitration, where the 76ers were determined to have violated multiple league rules in the structure of their offer sheet, and the Rockets eventually matched the modified version. That then allowed the Rockets to trade Malone to the 76ers in exchange for forward Caldwell Jones and the 1983 first-round pick of the Cleveland Cavaliers, who were expected to be terrible and in contention for the No. 1 overall pick. (As it turns out, the Rockets collapsed without Malone, winning just fourteen games in 1982–83 and winning the coin flip for No. 1 themselves. They selected Ralph Sampson with that pick, and also got Rodney McCray at No. 3 with the pick obtained from the Cavaliers through the 76ers.)
Meanwhile, the 76ers had just acquired the big-time rebounder and defender they needed to add to a terrific core of Julius Erving, Andrew Toney, Bobby Jones, and Maurice Cheeks. From the outset, Philadelphia’s new arrival, even though he was the reigning league MVP, seemed to understand his role.
“I know it’s Doc’s show,” Malone told the Philadelphia Inquirer after the trade, referencing Julius “Dr. J” Erving’s status as the team’s main star, “and I’m happy to be part of Doc’s show. . . . Doc’ll still be the show, but maybe now it’ll be a better show.”
Indeed, it was. The expected uptick in offensive rebounding thanks to Malone’s arrival helped close the projected offensive efficiency gap between the 76ers and Lakers in Guth’s model, and bumped his regular-season forecast for the 76ers up to sixty-six wins. He was pretty much spot on. The 76ers ended up corralling 1,334 offensive rebounds (a top-50 total in NBA history), went 65–17, and rolled to their first world championship in sixteen seasons, going 12–1 in the playoffs and sweeping the Lakers in the Finals.
“The 76ers were not the one [Malone] would contribute the most to,” Guth recalled when asked about the analysis, “but when he came to them, it led to the prediction that they would advance.”
Had this happened maybe twenty years later, Guth would have received more recognition and interest in his work, but he said after the 76ers’ projection panned out, he didn’t hear from any NBA teams about his system. There just wasn’t much interest at the time in computer analysis.
“It was almost a one-shot deal,” Guth said. “We probably did it for a couple of years, then I got heavily involved in the baseball [antitrust] hearings, and also got involved with the PGA Tour.”
Still, Guth thinks back to the days of the Hewlett-Packard mainframe and its findings, many of which turned out to be very prescient, even with the relatively limited amount of information at the time. Seeing what the industry has become today, with billion-dollar franchise values and entire submarkets built around Big Data analysis, he wonders what might have happened had he stuck with it and trusted his innate sense that all things in sports were undervalued in that era.
“If I applied my own analysis,” Guth noted, “I should have built my own consulting firm.”
The principal keeping of basketball statistics, basically since the beginning of the game, has centered around “counting” stats—the numbers observers can compile just by watching the game and adding. It’s easy to track the number of points a team or player scores in a game, or their scoring averages, or how many shot attempts and makes there were. You can easily count rebounds for individual players, and sum them up for each team. You can track assists by whatever definition you create to identify one. Eventually, “steals” and “blocked shots” also became official categories. And all of those stats ended up being compiled into box scores that were printed in newspapers around the nation as a way to summarize what had happened in a game.
So-called advanced basketball statistics, in the way that we currently understand and continue to evolve them, date back at least as far as 1959, when then-North Carolina head coach Frank McGuire authored a book called Defensive Basketball. In it was a section written by then-assistant coach Dean Smith (who would go on to his own Hall of Fame career as the Tar Heels’ head coach) that discussed how to evaluate the effectiveness of a team’s offense and defense not by its raw totals of points scored or allowed, but by how many it tallied or conceded “per possession.”
North Carolina set offensive and defensive scoring targets in that era on a per-possession basis, with Smith writing in the book that the Tar Heels wanted to keep teams below 0.63 points per possession while scoring more than that figure. Their methodology at the time considered an offensive rebound to create a new possession, and the book emphasized that defensive rebounding was crucial so the Tar Heels would end up with more possessions—read: more chances to score—than the opponent. Those “extra” possessions on both ends lowered the points-per-possession target below modern averages, as any possession that ended with an offensive rebound would be included as worth zero points. Today, in order to keep the number of possessions as equal as possible for both teams, offensive rebounds are considered part of the same offensive possession.
However North Carolina was defining possessions, Smith is widely cited as the first person to understand that the pace of a game had a significant role in determining just how good or bad a team was on either end of the floor, because composite statistics and averages don’t take into account how many opportunities a team had during the course of a game. If two different teams each average eighty points a game, but one plays at a pace of seventy possessions per game and the other plays at ninety possessions per game, the team with the slower tempo has a much more effective offense (roughly 1.14 points per possession) than the faster team (0.89 points per possession). The most lethal offensive teams (like the 2014–15 Golden State Warriors) play at high-possession tempos while registering great points-per-possession numbers, but most teams have a tradeoff on tempo versus efficiency once they speed up to a certain level.
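A minimal sketch of Smith’s pace adjustment, using the hypothetical eighty-point teams above:

    # Points per possession: divide scoring by opportunities, not by games.
    def points_per_possession(points_per_game, possessions_per_game):
        return points_per_game / possessions_per_game

    slow_team = points_per_possession(80, 70)  # roughly 1.14
    fast_team = points_per_possession(80, 90)  # roughly 0.89
    print(round(slow_team, 2), round(fast_team, 2))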
The modern origins of basketball analysis, though, stem from baseball—more specifically from the work and impact of Bill James, widely considered to be the Godfather of advanced sports statistical analysis. Shortly after graduating from the University of Kansas in the early 1970s, James began writing about baseball in new and unusual ways. After finding resistance from traditional media outlets that didn’t really understand or appreciate his work, James started self-publishing his now-famous annual Baseball Abstract in 1977 (and continued doing so until 1988). From there, James went on to publish a sizable number of additional books, and was hired by the Boston Red Sox as a consultant in late 2002. As of August 2015, he was still with the club in an advisory role.
James is responsible for a huge number of statistics that either have maintained their relevance or served as a launching point for additional study, and in the process, his baseball work spurred others to try to mimic significant parts of it for basketball. Among James’s most famous concepts were runs created, which attempted to identify a specific player’s responsibility for his team’s run-scoring; Pythagorean winning percentage, which used run differential to establish what a team’s record “should be” versus what it actually was; and win shares, which was a catch-all statistic designed to gauge a player’s contribution to his team’s success, allowing for cross-position and cross-era comparisons of players.
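For reference, James’s Pythagorean expectation in its original baseball form squares the run totals; basketball adaptations keep the same structure but use a much larger exponent. The season totals in the sketch below are invented purely for illustration.

    # Pythagorean winning percentage: James's original form uses exponent 2
    # for baseball; basketball versions use a much larger exponent.
    def pythagorean_win_pct(scored, allowed, exponent=2.0):
        return scored**exponent / (scored**exponent + allowed**exponent)

    # Invented totals: a team outscoring opponents 8,600-8,300 over 82 games,
    # with an exponent in the range commonly cited for the NBA.
    print(round(pythagorean_win_pct(8600, 8300, exponent=14) * 82, 1))  # ~51 wins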
Right as James was coming to his decision to cease publication of his annual abstracts, a handful of basketball-related analysts started working on what came to be dubbed APBRmetrics—honoring the Association for Professional Basketball Research—with many of the earliest practitioners working on offshoots of James’s seminal stats that could apply to basketball. Here are some of the biggest names in the early advancement of basketball analytics, in approximate order of their time of prime impact:
• Dave Heeren is considered one of the forefathers of the basketball analytics movement. He created and later refined TENDEX, which is credited with being the first linear-weight basketball metric. A linear-weight metric assigns positive values to good events and subtracts value for negative events to come up with a relative figure for player performance (a rough sketch of the general form follows this list); it is fairly easy to calculate and understand, even if it is not as nuanced or complete as a nonlinear calculation like wins over replacement value. Heeren once worked as a statistician for the New York Knicks but is better known for his annual basketball books, called Basketball Abstract, which were popular in the early 1990s.
• Martin Manley became prominent in the same time period that Heeren did, and published his own annual books called Basketball Heaven. The output of his player evaluation formula, which was nearly identical to Heeren’s but also included the impact of turnovers committed by a player, was dubbed “Manley Credits,” and became the basis for the NBA’s own efficiency rating. Sadly, Manley, who also wrote for the Kansas City Star, may be most well known for how he died, committing a meticulously planned suicide outside an Overland Park, Kansas, police station on his sixtieth birthday in 2013, and leaving behind a detailed website that explained why he took his own life.
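The sketch below, referenced in the Heeren entry above, shows the general shape of a linear-weight box-score rating. The equal weights of one are illustrative rather than Heeren’s exact published TENDEX formula, and, consistent with the Manley note, turnovers are left out of this version.

    # A generic linear-weight rating in the spirit of TENDEX: add the good
    # box-score events, subtract the bad ones, scale by minutes played.
    # The weights of 1.0 are illustrative, not Heeren's published formula.
    def linear_weight_rating(points, rebounds, assists, steals, blocks,
                             missed_fg, missed_ft, minutes):
        raw = points + rebounds + assists + steals + blocks - missed_fg - missed_ft
        return raw / minutes

    # Hypothetical stat line: 24 points, 10 rebounds, 3 assists, 1 steal,
    # 2 blocks, 9 missed field goals, 2 missed free throws in 36 minutes.
    print(round(linear_weight_rating(24, 10, 3, 1, 2, 9, 2, 36), 2))  # ~0.81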