What to Do When Things Go Wrong

by Frank Supovitz


  When I first learned about the accident, I didn’t think to ask the caller whether he knew that the bollards had pierced the body of the car, or whether he only believed that because someone else had told him, but I should have. Not that it would have changed the outcome, but now, when I hear about a problem, especially a big one, I ask the questions: “Do you know that for sure? Did you see or hear that yourself, or hear it from someone else?” Verifying the accuracy of information is essential to fixing the right problem or managing a response.

  Chasing Ghosts

  When faced with a challenge, my former colleague Bill McConnell often wondered out loud whether he was “chasing ghosts,” that is, acting on something he believed was there, but really wasn’t. If you haven’t witnessed the problem yourself and you can’t immediately verify the details, sometimes you have to err on the side of acting as though the ghosts are real. Our security team, for instance, was regularly confronted with packages or backpacks sitting unattended at one of our event sites, either in plain sight or discovered hiding behind a door or garbage can. Thankfully, in every case of which I am aware, the lone item was an innocent box filled with merchandise, supplies left unintentionally by a careless worker, or a backpack of work materials temporarily stashed in a hiding place to keep it from being stolen.

  On April 15, 2013, however, an unattended backpack left near the finish line of the Boston Marathon was anything but innocent or unintentional. Two pressure-cooker bombs exploded, killing three people and injuring hundreds. This occurrence demonstrates the importance of chasing those ghosts every time because there is a possibility that lives, or safety, are at risk. The best outcome is an entirely wasted effort because it turns out to be nothing. The next best outcome would be quickly taking whatever decisive actions are necessary to protect the safety of everyone in the vicinity.

  In most non-safety-threatening situations, however, chasing ghosts is unproductive. We may have to do it sometimes because the risks of not acting on some possibilities might be economically or reputationally damaging. For instance, it may be prudent to hold off on launching a new campaign because of unsubstantiated reports that the celebrity hired as a spokesperson was involved in a controversy or a criminal act. The reports may turn out to be based on nothing, but proceeding without cautious investigation would be unnecessarily and dangerously risky.

  That’s why it is essential to verify as many details as we can in the opening moments of an incident or crisis, helping us identify what problem we are really trying to solve. When 125 people reportedly arrived with tickets that failed to scan at the security checkpoint, we could have been facing a failure of our ticket-scanning equipment, a corruption or loss of connectivity to the database of ticket bar codes, an attempted security breach, or a counterfeit ticket problem. Today, I can’t say for sure whether there were 125 people really waiting in the rain because I didn’t ask whether the Gate Supervisor had encountered the entire group or had only interacted with the “tour organizer,” who may have trotted out the story after his tickets were rejected by the system. Rather than spending time chasing ghosts trying to ascertain whether the scanning system was at fault, or defeating the purpose of scanning tickets in the first place by waving some number of wet and allegedly unhappy people through the gates with our apologies, we tried to catch the ghost by visually authenticating the tickets. In the meantime, as ghosts are wont to do, the problem vanished.

  DIAGNOSING SYMPTOM VERSUS CAUSE

  After verifying the details, the next component of identifying the problem we are trying to solve is understanding whether we are dealing with the root cause, or just a symptom. Chris Barbieri was working in the information technology (IT) department of a bank on a Monday afternoon when he and his colleagues became aware that its servers were infected by an insidious software virus. A computer virus can be pretty disruptive for any business, but it can have far-reaching effects on a financial institution that go well beyond simple inconvenience. An inability to access customer data or process transactions can profoundly affect the well-being of not only the bank, but also of the businesses and depositors the bank serves, resulting in significant reputational and financial damage. The problem seemed isolated to just one of the bank’s many technology systems and the team moved quickly to restore it to a fully functional condition before the day had ended.

  By Tuesday morning, however, it was clear that the virus they thought had been scrubbed from the affected area had been working overnight, infecting virtually every software system in the bank. As more and more symptoms developed, the IT team was under enormous and continuous pressure to keep the bank running without interruption, and with a minimum of inconvenience to its customers. Having acted quickly to knock down one problem, only to have new symptoms develop literally overnight, the IT team understood that the issue they considered conquered was just one symptom of a much larger and more sinister root cause. A virus had been launched to purposefully wreak maximum havoc on their business. Barbieri recollects a frustrating game of whack-a-mole as the team struggled to resolve one symptom, only to have new ones spring up over the course of the week. “What exactly is the problem, what and who did it affect, and how do we contain it?” Those were the most important questions Barbieri and the team needed to answer as quickly as possible.

  PRIORITIZING RECOVERY

  After the fact, Barbieri and his colleagues undertook a root-cause analysis to determine how and why the attack happened, and how they would keep it from happening again. (See Chapter 23.) Analyzing precisely why they were vulnerable, however, was less urgent than addressing each of the symptoms as they arose. Since they still had to support the bank’s overall business, the team delegated the recovery efforts to the people who could best fix each of the problems as they emerged, and everyone else returned to their day-to-day functions. The business could not suffer from the secondary, unintended effects of all technology resources being focused on the virus attack.

  The bank’s IT team identified the most important things they had to do:

  • Actively engage in recovery efforts for each affected system as problems emerged

  • Continue the uninterrupted delivery of essential bank services to its customers

  The business chose not to allocate all of its specialized resources to solving what had gone wrong, so it could continue to serve the needs of the bank and its customers. Also, it did this to avoid unintended consequences that could arise while everyone else’s attention was diverted to the crisis.

  Often symptoms must be addressed faster than the root cause. At 5:00 a.m. on Sunday, December 12, 2010—the day of a scheduled game between the Minnesota Vikings and the New York Giants—the snow that had accumulated atop the Minneapolis Metrodome during a powerful blizzard proved to be too much for the air-supported roof to withstand. Crews had been working throughout the storm, using steam and hot water, to clear away as much snow as possible until heavy winds threatened to sweep the crew off the roof. The building’s heating system applied warmth from below and more hot air was fed between the two layers of the fabric roof in an effort to melt the snow collecting above. Thanks to the deeply cold temperatures, however, snow continued to pile on top, as deep as two feet in valleys between the roof panels.

  Our phones rang a few minutes after the roof failure, alerting us to an imminent all-hands conference call. We weren’t jumping on the phone to plan how to fix the Metrodome roof or to investigate why it gave way. We were going to discuss where and when the game scheduled for that afternoon was ultimately going to be played, and to start planning where the final Minnesota Vikings home game would be staged a couple of weeks later.

  Similarly, the first and most important thing during the Super Bowl XLVII power outage, on February 3, 2013, was not turning the lights back on. It was to take steps to avoid panic setting in among the fans in the half-dark stadium. That’s why one of the first things we did was to ensure that the public address system was activated and that we knew what we were going to say to the crowd.

  Transitioning to Recovery

  Superdome executive Doug Thornton was fielding reports from his engineering team assessing the cause of the electrical malfunction:

  DOUG: “Frank, we lost the ‘A feed.’ ”

  FRANK: “What does that mean?”

  DOUG: “That means we have to do the bus tie.”

  FRANK: “What does that mean?”

  DOUG: “That means about a 20-minute delay.”

  Doug and I had discussed the recent upgrade to the electrical system during the planning stage, so I knew what the term “bus tie” meant. The backup cable installed by the power company would have to be connected into the side of the building that was now in darkness. What I really wanted to know was how long it was going to take, and Doug was ready with an answer that proved remarkably accurate. I didn’t realize it at the time, but Doug and his operations team had been actively working through the recovery process from the moment the lights went out, and not a moment too soon.

  21

  MANAGING RECOVERY

  If it was more than a minute-and-a-half that went by, it wasn’t by much. The New Orleans Police Department (NOPD) captain at NFL Control stepped in front of me, the tips of his shoes touching mine, and whispered softly. “It’s not a terror or cyber-attack.” Well, check that one off the list.

  While I was talking to Doug Thornton—the Mercedes-Benz Superdome senior executive during the Super Bowl XLVII blackout—and calling for the public address announcement, the NOPD, the NFL’s security team, and the full complement of law enforcement agencies working under the Department of Homeland Security (DHS) wasted no time comparing intelligence and assessing the possibility of foul play. There was nothing to suggest it was. Now that we knew for sure that it was safe to tell people to stay in the stadium, one of our teammates ran the hastily scribbled script down to the public address announcer.

  We had already determined the power outage was confined to the stadium. No part of the city’s electrical grid had failed. NFL Control had no view outside the building and the outage had disrupted the checkerboard of video feeds from the surveillance cameras ringing the outside of the building, but security guards out there confirmed that the lights in the surrounding area were still shining. It was hopeful news that the problem was localized to the stadium and that we were not hostages to a larger and infinitely more complicated problem.

  Doug and his team had an existing protocol for full or partial power failures and reported that they were shutting down all nonessential equipment on both sides of the building—the side that was still illuminated and the side that was dark. The Superdome had always had a redundant power system to start with, even before the backup cable was installed months before. If one feeder failed, the entire building could be powered from the other. This required reducing overall power consumption, and a list of candidate systems to shut down was already in place to accommodate that.

  The Super Bowl, however, required a great deal more energy than a normal football game, hence the precautionary decision was made to install the backup cable. If you were at the game and it started to feel just a little bit stuffier, it was because the air conditioning was shut down. Walkway lighting in the unaffected side of the building was reduced. Refrigerators, electric cooktops, sponsor displays, escalators, and other nonessential power drains were in the process of being powered down.

  Throughout the 24 minutes it would take for the dark side of the stadium to be completely re-energized and the field to be lit to full brightness, NFL Control was a beehive of activity. We continually asked ourselves and each other:

  • “What are we doing?”

  • “What still needs to be done?”

  • “What should we be doing?”

  This time, “What can I do to help?” came from a new voice in the room. It was Eric Grubman, NFL executive vice president of business operations, who was at that time my direct superior at the League. He had run over to NFL Control from one of the League’s hospitality suites. Eric was a former U.S. Navy submarine officer and energy company president, with more than a passing familiarity with electrical systems.

  “It’s not a safety or security issue,” I assured him. “One of the two feeders into the building shut down.”

  “We’re heading down to meet the engineers at the switchgear vault,” Doug added. That is where the two feeder cables go before they enter the building. Eric departed with the contingent of people who left to inspect the equipment, and left us to continue managing and monitoring the overall response.

  If we had any uncertainties about how the blackout was going to play out, at least we could see what was happening on the field from our vantage point built on the top rows of the upper deck. There was understandably more confusion in the windowless CBS broadcast truck, where producer Lance Barrow and director Mike Arnold were scrambling for information, partially blind and almost totally deaf to what was happening on the field. They could only see patches of the activity on the long bank of monitors lining the long wall of the expanded trailer, most of which had turned gray. We couldn’t tell them anything, even if we wanted to, because all connections between NFL Control and the broadcast truck had been cut off. The broadcast was still being fed to the outside world, but CBS could only show images from the 15 of 62 cameras that were still functioning.

  Play-by-play announcers Jim Nantz and Phil Simms, from their broadcast booth overlooking the field, could have painted a verbal picture of what they were witnessing, but they were also on the dark side of the stadium and their equipment was dead as well. “The network,” as Armen Keteyian told it in his 60 Minutes Sports report, “was flying blind.” Barrow eventually established indirect communication with sideline reporter Steve Tasker through a cameraman who could still hear his instructions. Although neither Tasker nor fellow reporter Solomon Wilcots could hear the CBS team in the truck, their microphones allowed them to report from the field while the network reactivated their pregame set on the sideline.

  Social media proved to be the primary source of information (and disinformation) to people in the stadium, viewers watching at home, and to the world-at-large about what was going on at the Superdome. While CBS was reestablishing more meaningful contact with their viewers on television, fans in the stands were communicating with their friends and families on Facebook and Twitter, and they were quick to tweet about the eerie atmosphere in the stadium. Twitter reported an increase from 185,000 tweets per minute when Jacoby Jones scored his 108-yard touchdown to an average of 231,500 during the 34-minute game delay.

  The problem for fans was separating facts from the rumors, and news from satire. The volume of the latter was multiplying by the moment. Once it was determined that there was no life-threatening disaster developing in New Orleans, the world could have a laugh at our expense. In two successive tweets, Walgreens Pharmacy posted that they carried candles . . . and lights. Tide detergent bragged that it “couldn’t get the blackout, but could get your stains out,” Jim Beam Black® promoted itself as the whisky sponsor of the blackout, and PBS encouraged viewers to switch to their concurrent airing of Downton Abbey. Nike congratulated Jacoby Jones for his “lights out speed” on the kickoff return, but it was Oreo cookies that was acknowledged as the Super Bowl retweet champion, comforting distressed fans that there was “No problem. You can still dunk in the dark.”

  While clever brands enjoyed a hearty chuckle and leveraged an unparalleled opportunistic marketing bonanza, the operating team stayed entirely focused on getting the power restored to the stadium and dealing with any issues that flowed from the outage. The electrical engineers were still working on the bus tie. The stadium crew was rescuing people trapped in elevators. And, one shaken Superdome electrician, I was later told, was being strenuously reassured that his work repairing an outlet somewhere in the building was not responsible for the crisis. (To this day, I don’t know if that was an apocryphal tale, but it is too delectable not to share.)

  IN THE HEAT OF THE MOMENT, BLAME IS NOT IMPORTANT

  One of our media relations representatives tapped me on the shoulder to relay that Entergy, the electric utility, had already taken to Twitter to point a definitive finger of blame toward the Superdome. The power they provided to the stadium, they tweeted, had not been interrupted. The problem, they said, was on “the customer’s side.” The Superdome was reportedly preparing to send out their own dueling Twitter post that squarely placed responsibility for the failure on Entergy’s side.

  I was asked by our PR team: What did we want to say? “We’re just trying to get the plug back in the wall,” I answered. “Let the two of them duke it out.”

  There would be plenty of time afterwards to uncover the root cause and contributing factors, and if the power company was blaming the stadium and the stadium was blaming the power company, let them. No one was blaming us, at least not yet. The conspiracy theories would later mushroom on social media after play resumed, as the San Francisco 49ers threatened to even the lopsided score during the second half. Worrying about where to assign blame was a wasted effort, and the only information coming from us should be: “We’re going to get the lights back on, we’re going to play the rest of the game, and we’ll let you know when we can expect that to happen.”

 
