Chasing Perfection: A Behind-the-Scenes Look at the High-Stakes Game of Creating an NBA Champion
But while his products focus extensively on ultra-micro analysis, the high-energy Maheswaran takes a very practical, high-level view of what makes his company’s work so potentially valuable.
“When computers understand anything, good things happen for everyone in that ecosystem,” he said while seated at an oversized conference table wedged into limited space in the left prong of the office. “A good example is when computers understood music, you got things like Pandora and Shazam, right? And so the equivalent is when computers understand sports, lots of good things happen. One of the fundamental premises for the technologies that we’re trying to develop is the ability for a computer to understand sports.”
But, Maheswaran notes, technical sophistication in the handling and processing of data is not enough to make for a good consumer product, even when you’re targeting a highly specialized internal audience of NBA teams and their coaching staffs. As good as the information may be, you have to be able to serve it in a way that will make sense to the end users, and won’t force them outside of the way they currently process information.
“So what we try to do is put numbers to words that people already use,” he said. “People already know that a shot is either a good shot or a bad shot, and we have mathematical models that actually quantify the shot—or the shot quality and the shooting ability. We also break down rebounding as not one thing; it’s three things. It’s positioning, attack, and hustle. So, words that coaches already use. But the biggest thing that we do is we use pattern recognition to identify things that happen in a game. So a pick and roll, and reject a screen, and a blitz, an ICE, all these things.”
The biggest hurdle for any external service like Second Spectrum is the potential for distrust of its findings. As longtime NBA coach Stan Van Gundy (who took over both the head coaching and personnel functions for the Detroit Pistons in 2014) famously railed on at the 2014 Sloan Sports Analytics Conference, many basketball lifers are reluctant to believe what a computer tells them. Van Gundy’s own hesitation was centered around his perception of the integrity of data input, specifically the qualifications of the people who were assigned to tag plays that fuel the systems of earlier market entrants like Synergy Sports Technology (more on them in a little bit, too).
As Van Gundy argued, there is potential for error when manually identifying and tagging data, especially if the operator is not really sophisticated in understanding the NBA game. Not everything is a straight pick and roll or an isolation or a catch-and-shoot jump shot. There are offensive actions that lead to other actions, which makes it somewhat inaccurate to label them as one specific thing. There are also many defensive tactics that aren’t cut-and-dried at all, especially when you don’t know the specific context of what the team was asking its players to do. Without knowing more specifics, it can be difficult to identify where a defensive mistake occurred, or why.
For coaches to get value from data, they have to have this kind of exactness. Otherwise, it doesn’t fully conform to their own qualitative ideology on both ends of the court, and can create dissonance when a data report presents information that may counter the coach’s intuition. The ramp-up of this kind of basketball understanding and precision as Maheswaran’s team developed their algorithms was the most crucial piece to building Second Spectrum from an academic idea into a successful commercial business.
“You can do tests against humans and say, ‘Here’s what our computer said, and here’s what a bunch of people said. [What] your stats said.’ And you can go against their staff, and maybe it’s not as good as their staff, but if it’s close, very very close, then you save them a lot of time,” Maheswaran said.
“The hard part about it is this concept of accuracy, both in terms of precision—if I say it’s a pick and roll, is it really a pick and roll?—and recall—how many of them do I actually catch? So there’s a balance. Because I could say, ‘Every single second of the game is a pick and roll,’ and I’ve gotten all of them, but my accuracy isn’t very good, or I could find the one pick and roll I’m sure is a pick and roll and say, ‘that’s a pick and roll,’ and my precision is perfect, but I missed a bunch. So how do you get high [marks] in both those areas?
“It turns out [for] a variety of people, it’s pretty easy to get 80–80. We had an undergrad who was able to get 80 percent precision and 80 percent recall in like three, four days. . . . The question is would that be OK for people who have large numbers [of data points]? And, there’s also bias. [If] you’re getting the 80 easiest ones, so you’re missing all the rejections and the slips [of screens], well, that’s a big deal. So the thing that we have brought to the table is the fact that we understand [these distinctions]. Our algorithms tend to be in the high 90s for [both] precision and recall. So we’re missing very, very few things. We’re basically better than most human beings, if you look at a collection of analysts. We can watch to an almost-human level.”
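The trade-off Maheswaran describes between precision and recall can be made concrete with a small Python sketch. Everything here is illustrative: the function name, the 100 numbered "moments," and the five true pick and rolls are invented for the example and are not drawn from Second Spectrum's system.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for a set of tagged moments.

    precision: of the moments we tagged, what share really were pick and rolls?
    recall:    of the real pick and rolls, what share did we tag?
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical game sliced into 100 moments, 5 of which are real pick and rolls.
all_moments = range(100)
actual_pick_and_rolls = {3, 17, 42, 58, 90}

# "Every single second of the game is a pick and roll": catches them all,
# but almost every tag is wrong.
tag_everything = precision_recall(all_moments, actual_pick_and_rolls)

# Tag only the one play we are sure about: every tag is right,
# but most pick and rolls are missed.
tag_one_sure_thing = precision_recall({42}, actual_pick_and_rolls)

print(tag_everything)      # (0.05, 1.0)
print(tag_one_sure_thing)  # (1.0, 0.2)
```

Neither extreme is useful, which is why a tagger has to score well on both numbers at once.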
The result is that Second Spectrum’s platform can parse anything from any number of games, and display the information in both digital and video form. The power of their output is staggering.
To give me an example, Maheswaran turned and started fidgeting with a laptop that was feeding a picture onto a large wall-mounted projection screen near the conference table. He tapped into the company’s Eagle system and queued up every pick and roll Clippers point guard Chris Paul had run so far in the 2014–15 season, and requested to see the plays unfold on video from three seconds before the actual screen was set (so we could see how Paul sets up the screens) through one second after the pick (to see what his initial decision would be).
These four-second plays looped continuously, one after another on the screen, and in a matter of a few minutes, we had visually consumed Chris Paul’s entire pick-and-roll repertoire from the current season. Work that would have taken a team’s video coordinator hours to compile a decade ago is now available in seconds with a few mouse clicks. Maheswaran said that his system can even assuage the fears of the Van Gundys of the world, because it also can log actions that aren’t the final action of a play. It can essentially track anything you enable it to learn.
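As a rough illustration of that kind of query, here is a minimal Python sketch that turns the timestamp of each screen into a clip running from three seconds before it to one second after. The `Clip` type, the `clip_windows` function, and the sample timestamps are hypothetical; nothing here reflects how the Eagle system is actually built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    start: float  # seconds from the start of the game video
    end: float


def clip_windows(screen_times, before=3.0, after=1.0):
    """Build one clip per screen: `before` seconds of setup, `after` seconds of decision.

    The start is clamped at 0.0 so a screen set very early in the video
    does not produce a negative timestamp.
    """
    return [Clip(max(0.0, t - before), t + after) for t in screen_times]


# Hypothetical timestamps (in seconds) at which screens were set.
clips = clip_windows([10.0, 2.0, 95.5])
for clip in clips:
    print(clip)
```

Changing `before` and `after` is all it would take to widen or narrow the window around each screen.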
“The power of having good algorithms is that it’s like you have a million pairs of eyes watching every single game,” he said. “You can scale. Whenever [you] want to know about the game, it’s as if you have someone who has watched every single second of your opponent’s game and you can get very complete scouting, or every single second of a particular player that you’re trying to scout.
“It’s not that you go back and watch the last three, five games and have a video staff chop it up. You can basically get information about every single moment of the game. Every single pick that led to a post up, that led to a layup. The things that happened—maybe a pick and roll that was stopped, that’s something that you might want information about. Or a pick and roll that was defended well [and] that might’ve led to something else, that led to an iso, that led to a score, and somebody might report, ‘Well, there was an iso,’ but you don’t talk about all the other things that we stuffed.”
So far, Maheswaran’s bet on his and his colleagues’ technological chops—specifically the discipline of spatiotemporal pattern recognition, which he oversimplifies as the science of “moving dots”—is paying significant dividends. With around 30 percent of the NBA as clients, Second Spectrum started adapting its technology for other sports, including professional football and soccer. The group also found a new home quickly after the aforementioned visit, shedding the U-shaped office for much bigger space near City Hall, a couple of miles northeast of Staples Center.
While the technology is complicated, Maheswaran and his team have simplified the output to make for an elegant end-user experience. The use of language that basketball people actually use provides staffs with subject comfort as they seek out information that they think will help them make decisions. Knowing that coaches and their staffs often are less receptive when data is pushed onto them, Second Spectrum’s technology promotes the pulling of only the data that they’re interested in.
“We have not tried to necessarily generate a whole new class of things,” Maheswaran said. “We go to the coaches and say, ‘What is it that you would want to know that you cannot know right now? Or what is it that you want to do that you cannot do right now?’ And we use the fact that a computer understands to get that to them. We don’t come up and say, ‘We’ve got seven new magic metrics that tell you what you should do,’ because coaches really, I think, if given the information, know their teams well, know their constraints well, and will figure out what the right thing is to do.
“The question is: Do you have all the information you need, when you’re making your decision? And one of our things is the most informed make the best decisions, and there is a bottleneck in terms of how well informed you can be based on the technology and the manpower you have. And what we do is just give you basically one hundred times the power of having way more information, or one thousand times the power. I’m not sure quite how to measure this. The other thing we can do is because we have a computer that understands the data, we’ve also done some algorithms that allow the computers to understand the video.
“So here, now, coaches have the initiative to look at numbers,” he added, “[and since] we’ve used the power of the fact that the computer understands both the video and the data, you could essentially ask a question and get the answer both in numbers and in video, instantaneously. So that is a very, very powerful thing.”
Second Spectrum (and similarly oriented analysis businesses, such as MOCAP, a Silicon Valley company that works with the Golden State Warriors) is a next-generation development of the NBA’s original data and video compilation technology provider, Synergy Sports Technology. That company was founded by former college and NBA assistant coach Garrick Barr back in 2005, and the origins of his service, which continues to thrive and is used by all thirty NBA teams and virtually every Division I men’s and women’s basketball program, come from very serendipitous roots.
As Barr explains it, back when he was a Phoenix Suns assistant coach in 1992, he had walked into a store to buy some music equipment. Inside the store, Barr noticed that the retailer also sold audio/video equipment, and his interest was piqued by a low-end, first-generation AVID nonlinear digital video editing machine. Tape cutting at the time was very laborious, and the state-of-the-art equipment was extremely large and impossible to bring on the road, so teams weren’t able to replicate the film work they did at their own facilities when they were on road trips. If anything was put together, it would come from the home office and be sent out via FedEx or a similar carrier.
Beyond a reduction in size and the ability to potentially travel with the new equipment, Barr also understood the advantages of being able to streamline the tape-cutting process, potentially allowing the Suns to splice tape during games for use in halftime meetings and in-game strategy adjustments. He asked the store owner if he thought the AVID editor could be built into a protective travel case. It could, and it was, and Barr suddenly had provided the Suns with a very significant advantage while starting to push pro basketball out of the deck-to-deck tape era (with AVID forming its own business called AVID Sports Pro to leverage the idea). Barr also says that the Suns at that time built the first comprehensive team-scouting database, which was phone-synced so everyone was able to see the same data and could review each other’s reports.
These developments helped steer Barr away from a potential coaching career and toward a career creating technology designed to help coaches. In 1998, Barr partnered with another college basketball coach, Scott Mossman, to create Quantified Scouting Service, which produced computer-generated data reports based off of the company’s screening of game videos. This was still much too early for video streaming that could layer clips on top of numerical data being generated. Instead, Mossman and his wife used VCRs and a satellite dish to capture as many games as they could, and then farmed the tapes out to “loggers,” who tagged every play for their system. Barr’s clients then were able to use dial-up modems to access the reports.
A few years later, with video streaming at a level where it was reasonable to create a platform that paired video with the plays the loggers were tagging, Barr partnered with engineer Nils Lahr and rechristened the company as Synergy Sports Technology. By 2008, they had a licensing agreement in place with the NBA to provide their data to the league’s television and digital arms. Today, Synergy is a market-dominating, cross-sport technology phenomenon.
“The benefit of Synergy was that you were no longer tied to a local piece of equipment where you do all the work,” Barr said. “Instead, [now] it’s cloud-based and we do all the work. We do 80 percent of the tagging. You can still tag things and those can be associated and cross-pollinated with our data in custom reports. Now they can pull up any game that we’ve tagged. They can go through and tag all the play calls: fist, two out, fist down. It’s like baseball. Every team has a different set, so we can’t resolve that. We can’t figure out thirty different teams’ playbooks and the calls that they have. The plays tend to be the same, but they call it different things.
“So if the team will take fifteen to twenty minutes and tag a team’s play calls, the result is staggering. They end up with a report that shows the breakdown of what happens each time they run those plays, what play types are run as a result [of the set]: pick and roll, post up, iso, whatever. You can see where your stars are getting points or not. You may be running something that you think is designed to help your star get off, to get him a touch when you really need it, but it might turn out that he doesn’t get that and you don’t even know that. You can see which plays result in offensive rebounds or threes. You can see the proportions on everything, the points per possession on everything.
“You’re essentially taking three data sets and combining them to produce that report. There’s the stuff that we logged, the stuff that they logged, and the stuff the scorer’s table logged, so you got points, assists, and rebounds and all that stuff; you have shot locations—in college, based upon us tagging it; in the pros, they do it, so we just use their locations. And then you have all the play-type information we log, as well as the defensive side of the ball—who was guarding the play and what happened. Who the players are, what the play type was, where on the floor were they, what direction did they go, what was the move that was used. And then what was the ultimate result, and that’s all tied to the scorer’s table data—the stats feed—so we know who had the assist, whether it was a three or a two. We don’t have to tag all of that since it’s integrated data feeds.
“The result is you have a report that tells you what your team or the opposing team does, what proportion, how good they are, what they get out of it, how they set up, etc. And any way you want to, you can click on a matching data point and get the related video.”
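The report Barr describes amounts to joining a team's own play-call tags with the logged play types and the scorer's-table points, then summarizing by play call. The Python sketch below shows one way that summary could be computed. The field names and sample possessions are invented for illustration, and this is not Synergy's actual data model.

```python
from collections import defaultdict


def play_call_report(possessions):
    """Summarize possessions by play call.

    Each possession is a dict with:
      play_call: the team's own tag (e.g. "fist")
      play_type: the logged play type (e.g. "pick and roll")
      points:    points scored, as taken from the stats feed
    """
    totals = defaultdict(
        lambda: {"possessions": 0, "points": 0, "play_types": defaultdict(int)}
    )
    for p in possessions:
        row = totals[p["play_call"]]
        row["possessions"] += 1
        row["points"] += p["points"]
        row["play_types"][p["play_type"]] += 1

    return {
        call: {
            "possessions": row["possessions"],
            "points_per_possession": row["points"] / row["possessions"],
            "play_types": dict(row["play_types"]),
        }
        for call, row in totals.items()
    }


# Hypothetical possessions, already merged from the three data sets.
sample = [
    {"play_call": "fist", "play_type": "pick and roll", "points": 2},
    {"play_call": "fist", "play_type": "iso", "points": 0},
    {"play_call": "fist down", "play_type": "post up", "points": 3},
    {"play_call": "fist", "play_type": "pick and roll", "points": 2},
]

report = play_call_report(sample)
for call, summary in report.items():
    print(call, summary)
```

In a real product, each row of such a report would also link back to the matching video clips.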
Barr said he is aware of what companies like Second Spectrum are doing—in fact, Dallas Mavericks owner Mark Cuban, an investor in Synergy, brought the two parties together for a bit for potential collaboration—and, really, Synergy is providing similar types of data, similar access to video clips, and similar ease of use/accessibility for the end user. Barr makes sure his loggers also use “coach-speak,” so coaches are comfortable with the terminology returned in the queries.
While neither CEO specified his exact price points, conversations with teams that utilize their products confirm that Synergy is more affordable than Second Spectrum’s solution, which certainly has some appeal. There also are differences in the outputs. Synergy’s output, while really robust and increasingly aided by data visualization, doesn’t have Second Spectrum’s same granularity, as plays on Synergy are tagged according to the final action that leads to a shot, foul, or turnover, rather than breaking down the possession into multiple elements. Also, while you are able to pull up sequences of video like the Chris Paul example earlier, each clip is of the entire possession. If you want to carve out more specific time periods within a possession, you would need to go into Synergy’s nice video-editing capabilities to do that.
The quality of any output, though, is dictated by the quality of the data being used to generate it, and as such, the principal discussion in comparing the products of machine learning versus human loggers centers around the concept of “what is good enough?” While there are a few NBA teams that have brought most things in-house and hired a slew of programmers and data scientists to extract value from all of the different data being collected, most NBA teams don’t require—or want, for now—that level of rigorousness. A comprehensive outsourced solution with an acceptable standard of quality is a perfectly fine solution for them. They get much of the benefit without a lot of the operational headaches of hiring up and maintaining a database that still requires integration across other platforms to match the capabilities of top third-party providers.
Barr believes his loggers’ accuracy is more than good enough. He touts the company’s rigorous standards for quality assurance, where data tags go through a series of second checks and global spot-checking for accuracy, and he also knows that if his work weren’t accurate enough, he would hear about it directly from the thousands of coaches he works with. That said, Barr also understands the benefits of better technology, so he’s interested in the ongoing tech-driven processes that are converting SportVU motion data into valuable output. (As of August 2015, Synergy had not yet incorporated SportVU into its overall product.)
“One of the first things Lars asked me [when they started this] was ‘Can this be automated?’ And that would be great. That’s the Holy Grail. I get it,” Barr said. “Ultimately, human beings are fallible, and if you can perfect machine recognition, maybe you can eliminate the fallibility, although I think our error margin is completely acceptable by all coaches at this point, and pretty much all along.