by Pawel Motyl
In the 1990s, NASA was struggling with two basic problems. First, following the collapse of the Soviet Union, many of the Agency’s projects lost their importance because their potential military applications were no longer in demand. This was reflected in the dwindling generosity of Congress, which proceeded to turn off the flow of funding. Second, budget restrictions and management failures meant the flagship program of building an orbiting space station fell further and further behind schedule, to the increasing irritation of the authorities, the media, and, of course, NASA’s own managers. This created a vicious cycle, a genuine Catch-22: due to lack of funds, the project fell behind schedule, which became a compelling argument for providing even less funding, which put the project even further behind schedule, and so on. Two successive NASA administrators, Daniel Goldin and Sean O’Keefe, tried in various ways to cope with the budget restrictions. To save money, NASA initiated a partnership with the European Space Agency (ESA), Russia, and Japan, bringing all of those partners into its operations.
Both Goldin and O’Keefe put increasing pressure on the Agency’s staff during their time as administrators. On assuming his post in April 1992, Goldin said that if the NASA budget was cut back and they could do only one thing, it should be figuring out how to cut the cost of going to space by a factor of a hundred and improve its reliability by a factor of ten thousand. The organization lived by the slogan “faster, better, cheaper,” and indeed, Goldin turned out to be very effective at cutting costs. Over time, the slogan hardened into a mantra that shaped the decisions and actions of NASA managers. Sean O’Keefe, who replaced Goldin in December 2001, also stressed the importance of the schedule, telling his directors that at the very least they needed to fix a date by which they would have the components sent up to the space station. Dr. James Hallock, an expert who analyzed the operations of the Agency during this period, pointed out the dangerous disparity between what was declared and what was possible. According to his calculations, carrying out the project on the schedule set by the incumbent administrator would require at least five successful space shuttle missions every year for the foreseeable future. The problem was that five successful liftoffs per year was a massive challenge: in some years of the STS program, only two or three missions were flown.
So, NASA entered the new millennium as an organization that had taken practically no steps to alter its culture. The Agency became an increasingly commercial outfit, demanding more and more obedience from its employees and insisting that they adhere rigidly to their designated tasks. They had to worship at the altar of financial parameters, deadlines, and public image.
The failure to learn real lessons from the Challenger tragedy, combined with the total lack of change in the Agency’s culture and the increasing pressure from deadlines and a need for results, planted a ticking time bomb.
It finally exploded in 2003. 11 The first planned mission that year was a trip by the oldest of the space shuttles, Columbia, mission number STS-107, with a crew of seven led by Rick Husband. The mission’s aim was to carry out a series of experiments, including some on the Spacehab transport module. This flight, too, was delayed several times due to technical issues. It took off on January 16, five days later than scheduled. Although this time liftoff took place in good conditions, analysis of the film record of the flight’s initial phases revealed an unsettling anomaly. Eighty-two seconds into the flight, a 1-pound fragment of insulating foam broke off the external fuel tank, hitting the leading edge of the left wing. The incident occurred at an altitude of almost 12.5 miles, when Columbia was traveling at nearly 1,900 miles an hour. The strike was an undisputed fact, clearly recorded by the cameras. The engineers working on the Debris Assessment Team (DAT) were concerned, and after an internal discussion, they reported their worries to the Mission Management Team, chaired by Linda Ham. Management’s reaction was startling: the DAT report aroused little interest and the managers seemed unconcerned by the problem. They cited three arguments. First, the engineers had no hard data to prove the wing had actually been damaged. Second, such incidents had happened before without serious consequences; long-standing employees pointed, among others, to missions STS-7, STS-27, STS-32, STS-50, STS-52, and STS-62, during which insulating foam had also broken away. The last argument used by management was nothing short of shocking. Reportedly, Ham, in conversation with engineers, said that the incident was not a factor they were going to take into consideration, because even if the foam had damaged the wing, there was nothing they could do about it. In the face of such arguments, the engineers eventually capitulated. They tried to illustrate the risk using a computer simulation, but they weren’t able to demonstrate categorically that the foam had caused damage.
The Columbia mission lasted a total of sixteen days; on February 1, 2003, preparations began for the shuttle to land at the Kennedy Space Center in Florida. At 8:10 am, the shuttle’s commander was given permission by the Entry Flight Control Team to begin the re-entry maneuver. Thirty-four minutes later, the shuttle re-entered Earth’s atmosphere, its outer surface heating up as it compressed the air in front of it. Within five minutes the temperature had reached over 1800°F, which is typical on re-entry. At 8:53 am, while speeding along at Mach 23, Columbia flew over the coast of California at an altitude of over 43 miles. The outer surfaces were glowing at a temperature of 2650°F, about two hundred degrees less than normal. The shuttle began to disintegrate around five minutes later, when tiles from the thermal protection system (TPS), damaged by the strike, began to fall off. The TPS is there to protect both the shuttle and the astronauts from the high temperatures generated on re-entry. One minute later, the Landing Control Team alerted the shuttle command that they had lost telemetry data from the sensors in the damaged wing. At precisely twenty-eight seconds before 9:00 am, Rick Husband replied to Mission Control, “Understood, but...”
The sentence broke off midway, and so did the rest of the telemetry.
As in the Challenger disaster, an investigative commission, the Columbia Accident Investigation Board (CAIB), was set up to conduct a series of analyses to uncover the cause of the tragedy. Just as they had done seventeen years before, the experts highlighted two separate problematic areas. The direct, technical cause of the failure was obviously the damage done to the TPS tiles on the wing by the insulating foam, which meant the wing had no way of surviving re-entry. The second cause lay in management and organizational failings. As one of the CAIB members, General Duane Deal, put it, “The foam did it, the institution allowed it.” 12
The report published by the board runs to several hundred pages and is one of the most shocking studies on decision-making inertia, avoidance of responsibility, poor organizational culture, and lack of leadership that I have ever encountered. The mechanism that killed the astronauts on mission STS-107 was identical to the one that ended in the tragedy of STS-51L. This means it could have been avoided in at least two ways: either by learning the lessons of 1986 and introducing specific changes to the decision-making procedures on the STS program, or, since that hadn’t been done earlier, by applying those lessons during the sixteen days Columbia spent in orbit. The CAIB experts did not beat around the bush: they noted that, unlike the Challenger catastrophe, where once the shuttle had left the launch tower there was nothing anyone could do, in this case NASA had over two weeks to try to rescue the crew. Seven people might have been saved had the Agency’s management entered inquiry mode and admitted that it was dealing with a serious problem that demanded unconventional, unforeseen actions. Seven people could have survived if NASA had retained even one iota of the attitudes and behaviors that Gene Kranz’s team had demonstrated in 1970.
The results from the CAIB report can be divided into several main categories that reflect the definitions of the classic decision-making errors I’ve described so far.
Wrong Decision-Making Mode
The first group of mistakes centers on how the unexpected event of the insulating foam hitting the wing was treated. This was recognizably a black swan, because while foam had fallen off before, it had never previously struck the leading edge of the wing. The DAT engineers tried to go into inquiry mode (by asking for photos of the wing to assess the condition of the TPS tiles and any damage done to them), but they were refused permission by Mission Control. The managers ignored the threat, accepting only information that confirmed it was safe to continue the mission. They even cited experiences from earlier flights in their arguments (turkey syndrome).
It is notable, though, that even the worried DAT engineers failed to go fully into inquiry mode. As they testified during the investigation, their basic problem was a lack of evidence to support the threat’s existence. With the recordings available to them, and even with the computer simulations, they weren’t able to demonstrate the threat to the managers, and after having their requests turned down three times, they gave up trying to look for evidence. Meanwhile, the CAIB investigation quickly revealed that the necessary data could have been obtained from at least two sources. First, it would have sufficed, as part of an inquiry approach, to conduct a detailed analysis of the shuttle’s design specifications, which is where Dr. James Hallock, a CAIB member, uncovered some surprising data. The TPS tiles were unusually fragile: they were only rated to withstand a strike equivalent to a pencil dropped onto the wing from a height of 6 inches. It’s easy to imagine, then, the damage that must have been caused by a 1-pound piece of insulating foam moving at over 500 miles an hour relative to the shuttle. Second, it was possible to demonstrate the threat by the simplest of tests, such as the one the CAIB carried out: they took a spare wing from NASA’s stores and fired a similar piece of foam at it, essentially replicating what happened during liftoff. The results were shocking: the foam didn’t just damage the thermal insulation tiles, it actually punched a hole in the wing. An even simpler test was carried out by the 1996 Nobel laureate Douglas D. Osheroff, who tested the foam in his own kitchen. He showed that under high temperature and pressure the foam simply peeled off, demonstrating that the risk it posed during liftoff could have been identified much earlier, and in a very simple way.
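The scale of that disparity is worth spelling out. As a rough back-of-the-envelope sketch, assuming a pencil mass of about 7 grams (my assumption; the 1-pound foam mass, the 500-miles-an-hour relative speed, and the 6-inch drop come from the figures above), the certified impact energy and the actual strike energy compare as follows:

$$E_{\text{pencil}} = mgh \approx 0.007\,\text{kg} \times 9.81\,\text{m/s}^2 \times 0.15\,\text{m} \approx 0.01\,\text{J}$$

$$E_{\text{foam}} = \tfrac{1}{2}mv^2 \approx \tfrac{1}{2} \times 0.45\,\text{kg} \times (224\,\text{m/s})^2 \approx 11{,}000\,\text{J}$$

On these assumptions, the foam strike carried roughly a million times the impact energy the tiles were certified to withstand, which makes the outcome of the CAIB’s firing test rather less surprising.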
This all shows the power of organizational culture. It turns out that abandoning inquiry mode didn’t only occur at the level of managers interested in continuing the mission. It also happened with the engineers responsible in large part for the astronauts’ safety. The work of the CAIB demonstrated clearly that the DAT had simple tools and solutions to hand with which they could have demonstrated their concerns to management and so forced through a different approach.
It’s no accident that people say the most effective killers of creativity and innovation in any business are the pressure to stick to existing plans and deadlines, and to keep an eye only on the bottom line.
Lack of Teamwork and Psychological Safety
During the testimony given by the DAT engineers, one other statement worthy of deeper reflection was made. Rodney Rocha, the leader of the DAT, when asked why he hadn’t tried to apply more pressure to Mission Control, said that he was too low on the corporate ladder to challenge any decisions made by those on the upper rungs.
It’s quite startling that the head of a team crucial to the safety of a mission would feel so far down the hierarchy that he was afraid to openly express his opinions to Mission Control. In 1970, NASA had had a culture of open discussion and professionalism. How had it become a place where a competent and experienced man, vital to the safety of the astronauts, felt he could not communicate openly with his superiors?
James Bagian, a former astronaut, didn’t hold back when he commented that in the 1990s, NASA did not tolerate disagreement and people knew that if they wanted to survive in the organization, they had to keep their mouths shut. Torarie Durden, who was a member of one of the teams of experts, is also on record as saying that NASA was exerting pressure on staff to meet their deadlines, no matter what it took.
There is no doubt that NASA failed to build psychological safety into its environment. Over the years, a silent process of cultural conformity took place at all levels. Senior staff expected only that deadlines were met and projects were cost-effective; any teams or individuals that challenged this were gradually moved off a project so they wouldn’t upset the apple cart with their rogue opinions. From this perspective, the DAT was a trouble-maker, forcing management to revisit their preconceived positive scenarios, which in turn resulted in the sidelining of the engineers from the risk assessment process (just as the Morton-Thiokol experts were sidelined in 1986). Subsequently, changes in the organizational structure, carried out in the 1990s, saw the status of the DAT reduced to below that of flight controllers, previously seen as their equals.
Lack of Leadership
However, even in such a flawed organizational structure, you can still encourage discussion, especially if it is vital to the quality of the decision-making process, as JFK demonstrated when he created EXCOMM. An essential condition for this is excellent leadership—the presence of people who are aware of the threats and risks, and at the same time have the courage to confront the given circumstances or cultural stereotypes and to abandon routine behaviors. Kennedy was just such a person, and apparently Gene Kranz was too.
Peter Drucker once said that “management is doing things right, leadership is doing the right things.” It would therefore be reasonable to conclude that while NASA had excellent managers in 2003, they were sadly lacking in leaders. The Agency’s management tried to do things right, meeting deadlines and delivering the expected results. However, there was no real leadership, which, as per Drucker’s definition, is what you need at key moments when the right decisions are crucial.
Linda Ham, who chaired the Mission Management Team during the STS-107 mission, had an impressive career history with NASA. She joined the Agency as a talented twenty-one-year-old and, thanks to her stamina and hard work, climbed the rungs of the organization to the very top: in May 1991, she became the first woman in NASA history to direct space flights, and less than a year later, she was the flight director on the STS-45 mission. Ham aroused ambivalent feelings in her co-workers. No one questioned her competence or decision-making boldness, but some pointed out that she could be quite authoritarian.
During the Columbia mission, Ham made a series of decisions that were later heavily criticized in the CAIB report. Among the most serious errors were disregarding and later overruling the opinions of the DAT engineers, refusing the request for photos of the wing to be taken using military satellites, and fostering in the team below her an attitude of powerlessness, a conviction that repairs were impossible. Distancing Mission Control from the engineers responsible for flight safety was nothing new, though, and neither was faulty communication between teams; they certainly weren’t specific to the Columbia incident. The weak communications were also rooted in the lack of leadership at NASA and are illustrated quite well by a brief exchange between Linda Ham and one of the members of the investigating commission:
CAIB: As a manager, how do you seek out dissenting opinions?
Ham: Well, when I hear about them...
CAIB: But, Linda, by their very nature, you might not hear about them.
Ham: Well, when somebody comes forward and tells me about them.
CAIB: But, Linda, what techniques do you use to get them? 13
To that question, Linda Ham gave no reply and the room fell silent.
After the final CAIB report was published, Ham was openly criticized, not only by the media but also by some politicians and members of the public, and she paid a price both professionally and personally. In 2003, she was demoted and sent to the National Renewable Energy Laboratory in Golden, Colorado; in the same year, she divorced her husband, the astronaut Kenneth Ham.
Analyzing her role in the STS-107 mission, Ham admitted in an interview with a local newspaper:
If people say there are problems with the NASA culture, I will admit that I am part of it. I grew up there. I was there when I was 21 years old and spent 21 years there. So besides growing up in Wisconsin, the only other thing I ever knew was NASA. 14
The fact is that flight control made a series of wrong decisions that repeated mistakes made in earlier years. Some of the decisions, attitudes, and behaviors were quite remarkable. Taking NASA’s history into consideration, though, together with the system of cultural pressures constructed over the years, it’s hard to place the blame on a single person. Diane Vaughan, a sociologist who worked with the CAIB, noted that she believed the problem was both cultural and structural, and that if the culture and structure did not change, anyone who joined the organization would inevitably adopt the same behaviors as their colleagues.
I don’t think any words could express the essence of the Agency’s problem quite as briefly, simply, and clearly. Those in management positions were either raised in NASA’s specific culture, and so couldn’t become the instigators of cultural change (Linda Ham), or were very rapidly instructed in the prevailing manner in which decisions were taken and risks managed, as well as in how to behave in the organization. So a change in personnel didn’t really change anything. What’s worse, the lack of leadership didn’t just affect the Agency. Nobody in the whole of the USA perceived any risk in the lack of a clear vision for NASA’s development (since Kennedy’s time, nobody had set long-term goals for the Agency), and no one in Congress or successive presidential administrations saw any danger in the deterioration of the organizational culture. The twin pressures of deadlines and constant budget cuts meant that dollars were worth more than astronaut safety. There were no true leaders.