by Paul Dye
Of course, they were just needling me back with the Burn PAD. But I figured that if it kept them busy, it would be fun to put the classic Apollo maneuver into Shuttle terms. “Sure FIDO—let’s see the PAD!”
Now you have to realize that a fully loaded Orbiter, with payload and consumables, weighed in at around 230,000 pounds when it got to orbit, so the weight of fuel to send the thing to the moon was going to roughly equal the entire all-up weight of the bird. That was clearly impossible. But it was fun to have that idea rolling around in our minds.
Sure enough, in about an hour, I had a nice Burn PAD to add to the crew’s morning mail—the “STS-127/Apollo 11 Memorial Burn PAD.” It had a time of ignition, attitude, targets—the whole deal. And knowing the precision and integrity with which our Trench guys worked, I trusted that they had actually run it through the computers and not just made it up. Now there were practical problems, of course—the burn duration with the Shuttle’s OMS engines was going to be more than a single revolution of Earth. That was not only impractical from an engineering standpoint, it was going to affect the orbital mechanics as well, because burns were usually targeted to a point in time. But it was fun nevertheless.
We did, of course, put a note on the top of the PAD, “DO NOT EXECUTE WITHOUT AN EXPLICIT GO FROM HOUSTON,” just to make sure that we didn’t joke ourselves into a sticky situation. I didn’t want to be the one who had to explain to management why we ran the Orbiter out of gas trying to get it to the moon…
Wake-Up Music
Wake-up music has been a tradition in the NASA human spaceflight program since astronauts were allowed to go to sleep on orbit—probably on the long-duration Gemini flights. Someone got the idea to wake up the crew with a bit of music played over the air-to-ground channel, and astronauts all being pilots (at the time), and pilots being pilots, the music chosen was often the worst tonal atrocities that could be found—often country and western twangs that could annoy just about anyone. The wake-up music was chosen as a joke by the (astronaut) CAPCOMs to annoy their brethren in space, and was a good source of levity within the team.
No one is sure when it actually became standard procedure to wake up every crew on every morning with music, but it was certainly a standard by the time the Shuttle came along. The problem was (and is) that coming up with songs every morning can get tough when you’ve done it over, and over, and over, and over… It’s easy to get stale.
By the time I was selected as a Flight Director, it was standard procedure for the Planning Shift CAPCOM to choose the music and make sure that it was ready to play. Since the CAPCOM and Flight Director worked side by side, it was not uncommon for the Flight Director to get involved with the selection—to the relief of CAPCOMs who were starting to run out of ideas. Eventually, the CAPCOMs solved this problem by turning it over to the crews themselves and letting them pick the music they wanted to hear. It was the death of the old days, when the idea was to bond the flight control team and the crew by annoying each other (it was an odd way to bond, but it seemed to work). As Shuttle flights became more routine, it became harder and harder to find annoying music, so this change was probably inevitable.
We still found times to be inspirational and unique, however. There was one early morning when we received word that Charles Schulz, the noted cartoonist and creator of Charlie Brown, Snoopy, and the entire Peanuts gang, had passed away overnight. Schulz had wormed his way into the hearts of NASA’s human spaceflight team long before, back in the 1960s. His legacy survives today in the most meaningful award a space worker can receive in recognition of sustained high performance—the Silver Snoopy. The award can only be earned once, and not by a manager. It is presented by an astronaut, and it rewards those who make human spaceflight safer and more successful. Schulz was an icon, and his passing saddened us all.
I recall that we heard of his death just a couple of hours before a crew was scheduled to wake up, and while I have no idea what music we were intending to play that morning, I knew what I wanted to do. I asked the team if anyone might possibly have a recording of the Peanuts music—music composed and performed by Vince Guaraldi for the first TV special about the Peanuts gang. True to the can-do nature of our teams, someone piped up that they had a CD back in their office and had sent someone to get it. We quickly had the comm techs in the basement get it cued up, and we woke the crew with the sounds of the soft jazz piano that all of us had grown up with at Christmastime. It was proof that you didn’t need to annoy someone to make wake-up music meaningful.
Chapter 10
Return to Flight
It’s never one thing that causes a catastrophic failure. Or, more accurately, it can be one little thing that doesn’t come into play until a whole chain of events happens that makes for a catastrophic failure. In modern mishap investigation, this is called the error chain. If we are to understand the cause of a major failure, then we have to follow the error chain from the small individual failure (the proximate cause) to the root cause: the thing that caused the little failure to become a catastrophe. If, for example, you conclude your investigation by saying that a loose bolt took down an airliner, then you really haven’t solved anything. If you figure out mechanically why the bolt was loose, then you are closer to the answer. But to really figure out how to prevent the mishap from happening again, you need to keep asking why until you find the root cause for the loose bolt. Usually, that comes down to a human error that can be prevented by some measure. That is accident investigation in a nutshell; and while it sounds easy, it can take a very long time if the causes of the mishap were complex.
We lost two Space Shuttles, and their crews, in the course of the program. Challenger went down on ascent in 1986, and the Columbia was lost during entry in 2003. Both mishaps were traced to mechanical failures fairly quickly. I recall first hearing that a leak in a solid rocket booster was the proximate cause of the Challenger loss within hours of the failure. The specific cause of the Columbia breakup took a few days longer, but it was suspected very quickly to be a failure in the leading-edge thermal protection caused by a piece of foam breaking off the External Tank on ascent. While it was good to know both of these things, neither of these facts, taken alone, was sufficient to tell us what really happened so that we could find ways to prevent it from happening again.
In both cases, it took nearly two and a half years after each occurrence to return the Shuttle to flight. In neither case was any of that time wasted. While the engineering community invested a huge amount of time and energy into fixing the physical causes of each mishap, the operational and management communities spent an equal amount of energy working toward finding the root causes—and those causes were deep and complicated. People problems usually are.
On the morning of January 28, 1986, I was standing in the lobby of JSC’s Building 4, the office building that housed system flight controllers, the training organization, and the Astronaut Office. I was on my way to a meeting to discuss procedures for the upcoming Astro-1 mission (an astronomical science mission that used the Spacelab IPS for which I was responsible), and I stopped in front of the television, along with several dozen others, to watch the final few minutes of the countdown for STS-51L. I had paid some attention to the launch process, but I hadn’t concentrated on it; I was a young flight controller with the next mission to think about. But when I saw those twin plumes of exhaust as the Solid Rocket Boosters (SRBs) took off on their own, leaving an expanding cloud of gas and debris where the Orbiter and External Tank had just been, I realized quickly that we weren’t going to be launching another Shuttle anytime soon, much less Astro-1. Our lives changed that day, and the program was never going to be the same.
We’d lost a crew, and at the same time, a vehicle. The vehicle was eventually replaced, but those seven lives could never be brought back. We all knew that this was more than a failure of technology. We needed to figure out how we got to the place that had let us think that it was okay to launch when we didn’t really know how cold temperatures could affect the O-rings that kept the hot gases inside the SRBs. There was a serious flaw in the decision-making process, and it had to be solved, as well as anything else in our operational processes and rules that was affected by the same culture.
It was culture that got us into trouble: culture, success, and the pressure to perform. We fell into the trap of success. With many diverse and complex missions launched in a short period of time, we felt that we knew what we were doing. We felt comfortable because nothing terrible had happened. But what we didn’t realize was that the terrible thing just hadn’t happened… yet. Simply put, we had lulled ourselves into thinking that since we had gotten away with it before, we could get away with it again.
I like to point out to people that this is like playing Russian roulette. In this macabre “game,” you place one bullet into a revolver and spin the chamber. Assuming it is a six-shooter, you now have a one-in-six chance that a live round is lined up with the firing pin and barrel. You point the barrel at your head, pull the trigger, and see what happens. If it doesn’t go off, you managed to win—you had five out of six chances of doing so. Now the smart person would have never played the game at all. But let’s say you had no choice and had to play. Say the live round didn’t line up when you pulled the trigger and you were safe. If you were smart, you would simply put the gun down and walk away, saying that you won and would never play again.
But what happens if you are not thinking it through? What happens if you think, “Well that wasn’t so bad—I guess I should try it again!” Well now you are on your way to disaster because every time you pull the trigger without firing the round, your chances of getting killed on the next trigger pull go up—not down. Anyone who survives on luck and wants to live a long time should take that luck and get themselves out of the game of chance. People in the flying business—either atmospheric or in space—are taking enough chances as it is, even when they are doing everything right. Relying on luck is simply not an acceptable way to fly.
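The arithmetic behind that warning can be sketched in a few lines. The text doesn't say whether the cylinder gets spun again between pulls, so this rough sketch covers both readings; the function names are made up for illustration. Without a re-spin, the odds on the very next pull climb from one in six toward a sure thing. With a re-spin, each pull stays at one in six, but the chance of having lost somewhere along the way keeps growing.

```python
from fractions import Fraction

CHAMBERS = 6  # six-shooter with one live round, as in the text


def risk_without_respin(pull: int) -> Fraction:
    """Chance that this pull fires, given every earlier pull was empty
    and the cylinder is NOT spun again between pulls (assumed reading)."""
    return Fraction(1, CHAMBERS - (pull - 1))


def cumulative_risk_with_respin(pulls: int) -> Fraction:
    """Chance of at least one firing across `pulls` pulls when the
    cylinder IS spun again before each pull (independent 1-in-6 each)."""
    return 1 - Fraction(CHAMBERS - 1, CHAMBERS) ** pulls


# Print pull number, next-pull risk (no re-spin), cumulative risk (re-spin).
for n in range(1, CHAMBERS + 1):
    print(n, risk_without_respin(n), cumulative_risk_with_respin(n))
```

Either way you read it, the longer you keep playing, the worse your odds of walking away.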
But our culture at NASA had developed into one in which we couldn’t fail—that we couldn’t allow ourselves to fail. That is a good thing if you are talking about actual in-flight operations: we would not allow ourselves to fail to bring our crew back. But when we extended that to launch decisions, which really are decisions on making a mission successful (you can’t have a successful flight if it doesn’t launch), that is where we got into trouble. Flight Directors are responsible for crew safety and mission success—not one or the other. That means we have to find solutions that satisfy both goals. Oftentimes, you must cancel a launch so that you have an opportunity to launch the mission another day. And by not launching when conditions aren’t right, you ensure crew safety as well. Never launching means that the crew will be safe (or at least won’t be lost in space—they could still get hit by a bus), but you never achieve mission success. Launching with the maximum opportunity for a safe and successful outcome is the key, and that got lost in our culture at the time of the Challenger crash.
But to a newly minted front room flight controller, all of that was beyond my ability or skill to affect. For me and my peers, the downtime after Challenger was filled with day after day, week after week, and month after month of detailed work on flight rules, procedures, and training. After a couple of years of continuous Shuttle flights, with one following the other with sometimes less than a month in between, we needed to be able to stand back, take a breather, and fix some of the things that we knew were less than perfect. Even though there had seemed to be an endless amount of time to get things “right” before the first Shuttle flew, until you get a flight program going, you don’t know what you don’t know. As a result, you might generate a lot of “stuff,” but it might not be the right stuff. Put another way, we didn’t have the resources to keep improving our operations and operational products and keep flying at the same time. The downtime after the crash allowed us to catch up.
Shortly after the Challenger accident, management and I decided it was time for me to move from the Payload Support section into something more mainstream. I then transferred to the Mechanical Systems Section to support Orbiter systems (and a whole lot of other things). As such, I first had to learn a lot about the systems as they were. Then I quickly became involved in modifications that were being made, primarily to the landing and deceleration systems for the vehicle. These included the tires, wheels, and brakes to start. Then there was the nosewheel steering system, which had never worked properly. It was therefore being upgraded to eliminate some single-point failures. These single-point failures were places in the design where if one problem occurs, you have no backup, no recourse—things could be fatal immediately, with a loss of control on the runway. Then we had the challenge of coming up with a new way to relieve the (undersized) brakes of their stopping task, as well as relieving loads on the nosewheel. The answer was a drag chute—and that required a huge amount of operational development as well as design work.
Every group within Mission Operations had similar improvements going on: upgrades to computers, changes to various valves and plumbing configurations, and countless wiring changes being done throughout the Orbiter. Every team was taking advantage of the break to fix all the things that needed fixing but that we hadn’t had time to work on. This by no means meant the Orbiter was a lemon or had a poor design; it just meant that this highly experimental, developmental spacecraft had gone straight from its first test flights into “routine” operations without a chance to fix all the things we learned about during the orbital flight tests. It was tragic that it took a fatal accident to give us the chance to make these upgrades. But the time was put to good use.
In addition to all the landing and deceleration changes that we were following in my group, we were also being made responsible for the new Launch Escape System—essentially pressure suits, parachutes, and an escape path (a jettisonable hatch and a slide pole to make sure you cleared the wing when bailing out). The escape system involved all new hardware, and that meant all new procedures and rules. There were also upgrades to the Auxiliary Power Units (APUs) and hydraulics that we were responsible for. This work followed and incorporated the upgrades that were made to the procedures and rules. To say that our group was any busier than any other Systems Division group would be to insult those others—everyone had things that were being worked. It was probably the busiest time I ever saw organization-wide in my years in the program.
In addition to all the system changes being worked, Mission Operations took the opportunity of the downtime to rework all the flight rules. They were reviewed one by one, and we made changes where we thought they were needed. The rationale behind every single one of them was documented. The Flight Rules book more than doubled in size as we documented the reasons behind all the rules. Everything was then incorporated into a single Rules and Rationales document that eventually grew too big for even a 6-inch, three-hole binder. In parallel with the rules reviews, we were also going through every procedure—nominal and off-nominal—with a fine-toothed comb to make sure there were no trap doors. We needed to be sure that all the physical changes to the systems had corresponding updates made to the operating and troubleshooting procedures.
In my own little piece of this world, I was not only supporting engineering changes to many of our mechanical systems, I was also learning a new flight control job: that of the Mechanical Maintenance Arm and Crew Systems Ascent/Entry flight controller. My previous experience in MCC (Mission Control Center) was spent operating and managing support systems for payloads, so I was experienced in the on-orbit environment and straddling the gap between the pure Orbiter operations and systems, and those of the payload world. My work started after Main Engine Cutoff. But being an airplane guy at heart, I was ready to soak up the dynamic phases of flight (ascent and entry) like a sponge. I spent day after day in the Control Center, either simming myself or watching others sim. Shuttle ascents were complex and happened very fast. There was no time for long discussions on potential solutions to a problem. You had to know your response instantly, and it had to be right. I enjoyed the high-pressure ascent time frame, of course. But my real love was for entries: the act of flying a winged vehicle through the atmosphere to a wing-supported landing. I have always said that I love landing airplanes. I certainly love flying them up and away, but landings require the delicate finesse of bringing the bird back to an exact point on the surface with a gentle descent rate so that nothing is damaged. It makes no difference if we’re talking a Piper Cub or a Space Shuttle; landing requires skill and confidence in a pilot.
So I was quite happy to be thrown into the landing/deceleration world where we were constantly looking for better and safer ways to bring every Shuttle landing to a successful conclusion. At the time of the Challenger accident, there had been several instances of burned-up brakes and even a blown tire or two. Everything in the landing world was slightly undersized because it had been specified at a time when the Orbiter’s weight was supposed to be lower. All aircraft gain weight in the design process, and it is important to recognize this when sizing components. Because weight is such a critical factor in a space launch vehicle, the tires and brakes for the Shuttle never really grew as the Orbiter did, so they were very marginal. This was why we looked at adding a drag chute to get the vehicle out of the high-speed regime on the runway as quickly as possible. As a result, the brakes needed to remove less energy. Surprisingly enough, there was a drag chute in the early conceptual designs for the Shuttle, but it was deleted in a weight savings exercise sometime in the mid-1970s. Adding it back required us to do a great deal of design testing as well as procedure and rule development so that we knew how to use it to best advantage.