Humble Pi

by Matt Parker


  Sometimes your cheese holes just line up.

  The Swiss Cheese model looks at how ‘defenses, barriers, and safeguards may be penetrated by an accident trajectory’. This accident trajectory imagines accidents as similar to a barrage of stones being thrown at a system: only the ones which make it all the way through result in a disaster. Within the system are multiple layers, each with their own defences and safeguards to slow mistakes. But each layer has holes. They are like slices of Swiss cheese.

  I love this view of accident management because it acknowledges that people will inevitably make mistakes a certain percentage of the time. The pragmatic approach is to acknowledge this and build a system robust enough to filter mistakes out before they become disasters. When a disaster occurs, it is a system-wide failure and it may not be fair to find a single human to take the blame.

  As an armchair expert, it seems that the disciplines of engineering and aviation are pretty good at this. When researching this book, I read a lot of accident reports and they were generally good at looking at the whole system. It is my uninformed impression that in some industries, such as medicine and finance, which do tend to blame the individual, ignoring the whole system can lead to a culture of not admitting mistakes when they happen. Which, ironically, makes the system less able to deal with them.

  But much like actual Swiss cheese, sometimes all the holes do randomly line up. Unlikely events happen occasionally. Which is what happened with the flight BA5390 disaster. All of these things had to go wrong for the window to explode outwards:

  >> Sam chose the wrong bolts

  The main store did not have enough of the part Sam needed. If the carousel had been restocked properly, he could have grabbed the 7D bolts he was after and just got on with it.

  The unstaffed store was disorganized. In the investigation it was discovered that, of the 294 drawers which contained stock, 25 were missing labels and, of the 269 which did have labels, only 163 contained only the correct parts.

  The store was poorly lit and Sam did not have his glasses to notice he had picked the wrong bolts.

  >> Sam did not notice that the bolts did not fit properly

  He would have felt the bolt thread slip when it went into the locking nuts. Except this slipping felt the same as the torque screwdriver kicking in when the required torque had been reached.

  The 8C bolts Sam used had a smaller head than the 7D ones he had taken out, and this looked obvious because they did not fill the recessed dip made for the bolt heads. Except the two-hand method he had to use to keep the screwdriver together obscured his view.

  >> No one checked Sam’s work

  If Sam had been anyone other than the shift maintenance manager, his work would have been checked by the, well, shift maintenance manager.

  The windscreen, amazingly, was not classified as a ‘vital point’ of catastrophic failure, and only vital points definitely had to be double-checked, even if performed by the shift maintenance manager.

  >> The windscreen could explode outwards

  Aircraft parts are often designed according to the plug principle, which is a form of passive failsafe. If the windscreen had been fitted from the inside, the air pressure from within the cabin would help hold it in place. Because the windscreen was fitted on the outside, the bolts were fighting against the internal cabin pressure.

  I can think of other things that would have stopped the disaster. The British Standard for A211 bolts could require a marking on the bolt itself, instead of just on the packet. The British Airways maintenance documentation could have been more explicit about the complexity of the task. The Civil Aviation Authority could require a pressure test after work is done on the pressure hull. The list goes on.

  The subtle effect here is that, while each of these individual steps may be fairly likely, the probability that they all happen simultaneously is very small. There will always be a few mistakes that make it through a few layers of cheese, but very rarely do enough holes line up to let mistakes become a disaster.
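  The arithmetic behind that claim can be sketched in a few lines of Python. This is only an illustration: the function name and the one-in-five figures are made up, and it assumes the layers fail independently of each other, which real accident chains may not.

```python
from math import prod


def chance_all_holes_line_up(layer_failure_chances):
    """Chance that a mistake slips through every layer.

    Assumes each layer fails independently of the others, which is a
    simplification; real safeguards can share a common cause.
    """
    return prod(layer_failure_chances)


# Made-up numbers for illustration only: four layers of cheese, each of
# which lets a mistake through one time in five.
illustrative_layers = [0.2, 0.2, 0.2, 0.2]

print(chance_all_holes_line_up(illustrative_layers))
```

  Each hole on its own is fairly likely, but multiplying the chances together makes the combined event far rarer than any single one.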

  It is not a comforting thought that minor mistakes and unfortunate circumstances pop up all the time in aviation and we are saved only by later things which happen to go right and neutralize the threat. But, statistically, that is the case and, statistically, we are extremely safe. We can believe in cheeses.

  Anyone who already has a fear of flying had better stop reading now and skip ahead to the next section. Don’t worry: you’ll not miss anything.

  For everyone else: here is an insight into how minor mistakes can take place with no ramifications. Remember those A211-7D bolts Sam removed from the original windscreen? That window in a BAC 1-11 jet airliner should have used A211-8D bolts. They were already wrong. When BA acquired that airliner, it came with the wrong bolts already fitted. It had been flying with the wrong bolts for years.

  During the investigation they found eighty of the old bolts that Sam had removed: seventy-eight were the incorrect 7Ds and only two were 8Ds. The aircraft had been flying with windscreen bolts which were slightly too short. Thankfully, the bolts had been selected to be long enough for the six spots in the thickest part of the window, and slightly too long in the other eighty-four places. The shorter 7D bolts were still long enough to keep most of the window firmly fixed in place.

  Ironically, the 8C bolts Sam grabbed by accident were the correct length. But they were skinnier and did not lock properly with the nuts: with enough force, they could be ripped out – as happened in this near-fatal disaster. If things had gone slightly differently (the windscreen failing at a higher altitude; the co-pilot not regaining control of the aircraft) that 0.66-millimetre difference in diameter could easily have resulted in the deaths of all eighty-seven people on board.

  Straight after the accident and before the investigation had been completed, BA did an emergency check of all its BAC 1-11s, removing every fourth windscreen bolt and measuring it. Two more aircraft were grounded because they were found to have the wrong bolts. A separate airline did a similar check and found that two of its aircraft were also using the wrong bolts.

  It’s terrifying.

  If humans are going to continue to engineer things beyond what we can perceive, then we need to also use the same intelligence to build systems that allow them to be used and maintained by actual humans. Or, to put it another way, if the bolts are too similar to tell apart, write the product number on them.

  TEN

  Units, Conventions, And Why Can’t We All Just Get Along?

  A number without units can be meaningless. If something costs ‘9.97’ you want to know what currency that price is listed in. If you’re expecting British pounds or American dollars and it ends up being Indonesian rupiah or bitcoin, you’re in for a surprise (and a very different surprise, depending on which of those two it is). I run a UK-based retail website and we had a complaint from a customer for our audacity in listing prices in a ‘foreign currency’.

  So the charge amount listed was foreign currency? Obviously, and probably for a good number of us ordering, we would be expecting a US dollar quote.

  –Unsatisfied mathsgear.co.uk customer

  Getting the units wrong can drastically change the meaning of a number, so there are all sorts of fantastic examples of such mistakes. Famously, Christopher Columbus used Italian miles (1 Italian mile = 1,477.5 metres) when reading distances written in Arab miles (1 Arab mile = 1,975.5 metres) and so estimated that Asia was only a leisurely sail away from Spain. His unit mistake, combined with some other faulty assumptions, meant that Columbus was expecting his destination port in China to be roughly where modern-day San Diego is. The actual distance from Europe to Asia would have been too far for Columbus to traverse, were it not for an unexpected land mass he hit instead. Although there is some speculation that he got the numbers wilfully wrong to deceive his sponsors and crew.
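  As a rough sketch of the size of that unit slip, here is the ratio of the two miles in Python. The two lengths are the ones quoted above; the function names are my own, and this ignores the other faulty assumptions Columbus made.

```python
ITALIAN_MILE_M = 1477.5  # metres, as quoted in the text
ARAB_MILE_M = 1975.5     # metres, as quoted in the text


def true_distance_m(distance_in_arab_miles):
    """Length in metres of a distance that was written in Arab miles."""
    return distance_in_arab_miles * ARAB_MILE_M


def misread_distance_m(distance_in_arab_miles):
    """Length in metres if the same figure is read as Italian miles."""
    return distance_in_arab_miles * ITALIAN_MILE_M


# The same written number shrinks by this factor when the wrong mile is used.
shrink_factor = ITALIAN_MILE_M / ARAB_MILE_M
print(f"Misread distances are {shrink_factor:.1%} of their true length")
```

  Every distance read this way comes out at roughly three-quarters of its true length.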

  When I was researching and writing this book the most common question from people I spoke to was ‘Will you talk about the NASA spacecraft which used the wrong units and crashed into Mars?’ (The second most common was Londoners asking about The Wobbly Bridge.) There is something about a units error that people love. Maybe because it is such a familiar mistake. Combined with the schadenfreude of NASA making a basic maths error, it makes for an enticing story.

  And this is a case where the urban legend is (almost completely) true. In December 1998 NASA launched the Mars Climate Orbiter spacecraft, which then took nine months to travel from Earth to Mars. Once it arrived at Mars, a mismatch of metric and imperial units caused a complete mission failure and the loss of the spacecraft.

  Spacecraft use flywheels, which are basically massive spinning tops, for stability and control. The gyroscopic effect means that, even in the friction-free vacuum of space, the craft can effectively push against something and move itself around. But, over time, the flywheels can end up spinning too fast. To fix this, an angular momentum desaturation (AMD) event is performed to spin them down, using thrusters to keep the spacecraft stable, but this does cause a slight change in the overall trajectory. A slight but significant change.

  Whenever the thrusters are used, data is beamed back to NASA about exactly how powerful the bursts were and how long they lasted. A piece of software called SM_FORCES (for ‘small forces’) was developed by Lockheed Martin to analyse the thruster data and feed it to an AMD file for use by the NASA navigation team.

  This is where the problem occurred. The SM_FORCES program was calculating the forces in pounds (technically, pound-force: the gravitational force on one pound of mass on the Earth), whereas the AMD file was assuming the numbers it received were in Newtons (the metric unit of force). One pound of force is equal to 4.44822 Newtons, so, when SM_FORCES reported in pounds, the AMD file thought the figures were the smaller unit of Newtons and underestimated the force by a factor of 4.44822.
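  One way to make this kind of mismatch harder to commit is to stop passing bare numbers around and carry the unit with the value. The sketch below is a hypothetical illustration in Python, not a reconstruction of SM_FORCES or the AMD file; the `Force` class and its method names are invented, and only the conversion factor comes from the text.

```python
from dataclasses import dataclass

NEWTONS_PER_POUND_FORCE = 4.44822  # conversion factor quoted in the text


@dataclass(frozen=True)
class Force:
    """A force stored internally in Newtons (hypothetical helper type)."""

    newtons: float

    @classmethod
    def from_newtons(cls, value):
        return cls(newtons=value)

    @classmethod
    def from_pounds_force(cls, value):
        # Convert on the way in, so every Force is metric on the inside.
        return cls(newtons=value * NEWTONS_PER_POUND_FORCE)


reported_number = 1.0  # an arbitrary thruster figure, for illustration

# What the producer meant: the number is in pounds of force.
intended = Force.from_pounds_force(reported_number)

# What the consumer assumed: the same bare number is already in Newtons.
misread = Force.from_newtons(reported_number)

print(intended.newtons / misread.newtons)
```

  The ratio printed at the end is the factor by which the misread figure understates the real force.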

  The Mars Climate Orbiter crashed not because of one big miscalculation when it arrived at Mars but because of many little ones over the course of its nine-month journey. When it was ready to go into orbit around Mars, the NASA navigation team thought it had been moved off course only slightly by all the angular momentum desaturation events. They expected it to glance past Mars at a distance of 150 to 170 kilometres from the surface, which would clip the atmosphere just enough to start to slow the spacecraft down and bring it into orbit. Instead, it was heading directly for an altitude of just 57 kilometres above the Martian surface, where it was destroyed in the atmosphere.

  Missed it by that much.

  All it takes is one units mismatch to destroy hundreds of millions of dollars of spacecraft. For the record, the NASA ‘software interface specification’ had specified that the units should all be metric; the SM_FORCES was not made in accordance with the official specifications. So it was actually NASA using metric units, and the contractor being old-school, that caused the problem.

  The problem that brought down a modern spaceship also sank a seventeenth-century warship. On 10 August 1628 the Swedish warship Vasa was launched and sank within minutes. For those brief moments it was the most powerfully armed warship in the world: fully loaded with sixty-four bronze cannons. Unfortunately, it was also rather top-heavy. Those cannons did not help, and nor did the heavily reinforced top decks required to hold them. All it took was two strong gusts of wind and the ship toppled over, sinking with the loss of thirty lives.

  Fortunately for history, the Vasa sank in waters that were ideal for preserving wood. Shortly after it sank, most of the precious bronze cannons were salvaged and the rest of the wreck was left and forgotten – until 1956, when wreck researcher Anders Franzén managed to locate the Vasa once more. By 1961 it had been raised from the water, and it now lives in a custom-built museum in Stockholm. Despite having spent three centuries lying on the bottom of the ocean, the Vasa is incredibly well preserved. It’s missing its cannons and original paint job but, otherwise, it looks eerily new.

  Modern analysis of the structure of the Vasa’s hull has shown that it is asymmetric, more so than other ships of the same era. So, while the overloading of the top of the ship was definitely a large factor in its lack of stability, an underlying mismatch of the port and starboard sides was also to blame.

  They like big hulls and they cannot lie (level).

  During the restorations, four different rulers were recovered. Two were ‘Swedish feet’ rulers split into twelve inches, and the other two were ‘Amsterdam feet’ rulers, split into only eleven inches. Amsterdam inches were bigger than Swedish inches (and the feet were slightly different lengths too). Archaeologists working on the Vasa have speculated that this may have caused the asymmetry. If the teams of builders working on the ship were using subtly different inches but were following the same instructions, this would have produced parts of different sizes. In this case, we don’t know what the ‘wood interface specification’ required.

  Not long after the June 2017 UK election a Google search for ‘how long has theresa may been pm’ gave her height in picometres.

  When it comes to measuring a leader’s body parts, trillionths of a metre is never the most convenient case. Except maybe for Trump.

  If you can’t handle the heat, get out of the conversion

  At least units of distance can agree on where to start. With length, there is a very obvious zero point: when you have nothing of something. Metres and feet might argue about the size of intervals, but they all start at the same place. With temperature, this is not so obvious. There is no clear place to start a temperature scale, as it could always be colder (within human experience).

  Two of the most popular temperature scales are Fahrenheit and Celsius, and each took a different approach to choosing a starting zero-point temperature. German physicist Daniel Fahrenheit proposed the scale which bears his name in 1724, and the zero-point was based on a frigorific mixture. If ‘frigorific’ has not instantly become your new favourite word, you’re cold and dead inside.

  A frigorific mixture is a pile of chemical substances which will always stabilize to the same temperature, so they make for a good reference point. In this case, if you give ammonium chloride, water and ice a good stir, they will end up at 0°F. If you mix just water and ice, it will be 32°F, and the far less frigorific mixture of human blood (while still inside a healthy human) is 96°F. While these were Fahrenheit’s original reference points, the modern Fahrenheit scale has since been adjusted and is now pinned to water freezing at 32°F and boiling at 212°F. Frigorific!

  The Celsius scale began around the same time, with Swedish astronomer Anders Celsius, except he counted the wrong way. Celsius started with zero as the boiling point of water at normal atmospheric pressure then counted up as the temperature went down, with water eventually freezing at 100°C. Meanwhile, other people went with the more popular convention of starting with zero at water’s freezing point and counting up to a hundred as its boiling point, and then they all argued over who had the idea first. There was no clear winner as to whose idea it was, but the unit itself caught on and was given the neutral name Centigrade.

  Celsius had the last laugh, however, when the name Centigrade clashed with a unit for measuring angles (a centigrade, or gradian, is one four-hundredth of a circle) so, in 1948, it was named after him after all. Celsius is now used almost universally to measure temperature, except seemingly in a few countries which still use Fahrenheit, like Belize, Myanmar, the US and a decent section of the English population who are ‘too old to change now’ (even though the UK has tried to be metric for about half a century). This means there is still some need to convert between the two scales, and temperature is not as easy as length.

  Measuring distances may involve different-sized units, but all systems have the same starting point, which means there is no difference when you go between absolute measurements and relative differences. If someone is 0.5 metres taller than me and 10 metres away, both these measurements can be converted into feet in the same way (multiplying by 3.28084); it does not matter that 10 metres is an absolute measurement and 0.5 metres is the difference between two measurements (our heights). It all seems so natural. But it doesn’t work with temperatures.

  In September 2016 the BBC news reported that both the US and China had signed up to the Paris Agreement on climate change, summarizing the agreement like this: ‘countries agreed to cut emissions enough to keep the global average rise in temperatures below 2°C (36°F)’. The mistake here is not just that the BBC is still giving temperatures in Fahrenheit but that a change of 2°C is not the same as a change of 36°F, even though a temperature of 2°C is the same as 36°F. If you were outside on a day when the temperature was 2°C and you looked at a Fahrenheit thermometer, it would indeed read 36°F. But if the temperature then increased by 2°C, it would only go up by 3.6°F.
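  The two conversions are different pieces of arithmetic, which a short Python sketch makes plain. The function names are my own; the formulas are the standard ones, with the 32-degree offset applying only to absolute temperatures.

```python
def celsius_to_fahrenheit(temp_c):
    """Convert an absolute temperature: scale by 9/5, then shift by 32."""
    return temp_c * 9 / 5 + 32


def celsius_change_to_fahrenheit_change(delta_c):
    """Convert a temperature difference: scale by 9/5, with no shift.

    The offset of 32 cancels out when one temperature is subtracted
    from another, so it must not be added to a change.
    """
    return delta_c * 9 / 5


# A thermometer reading of 2 degrees C, converted as a temperature.
print(round(celsius_to_fahrenheit(2), 1))

# A rise of 2 degrees C, converted as a change.
print(round(celsius_change_to_fahrenheit_change(2), 1))

# The change matches the gap between two converted readings.
rise = celsius_to_fahrenheit(4) - celsius_to_fahrenheit(2)
print(round(rise, 1))
```

  The first figure is what a thermometer would show; the second and third are how far it would move.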

  The crazy thing is, the BBC initially got it correct. Thanks to the amazing website newssniffer.co.uk, which automatically tracks all changes in online news articles, we can see the chaos in the BBC newsroom as a series of numerical edits.

  To be fair, the article was part of the live coverage of breaking news and was designed to be regularly updated. The first version of the article that mentioned temperature gave the change as 2°C. But there must have been some chat about the complaints they were likely to get if they didn’t add Fahrenheit, so about two hours later 3.6°F was added. Which is the correct answer!

  But this is an unstable correct answer because, even though it is right, there is a ‘more obvious’ but less correct answer that people will try to change it to (like someone crossing out ‘octopuses’ and replacing it with ‘octopi’). And about half an hour later 3.6°F disappeared and 36°F popped up in its place. A temperature of 2°C in absolute terms is 35.6°F, so someone must have seen 3.6°F and figured it was a rounded 35.6°F with the decimal point in the wrong place. I can only imagine the heated debates between the 3.6°F and 36°F factions as each tried to claim they were the holders of the ultimate temperature truth, until, in my mind, a frazzled editor shouted, ‘Enough! Now no one gets a temperature!’ At 8 a.m., three hours after 36°F appeared, it disappeared without replacement. It seems 2°C was enough. The BBC had given up on giving a Fahrenheit conversion.

 
