by James Reason
5.5. Zeebrugge
At 1805 on 6 March 1987, the ‘roll-on/roll-off’ passenger and freight ferry Herald of Free Enterprise, owned by Townsend Thoresen, sailed from the inner harbour at Zeebrugge en route to Dover with her bow doors open. As she passed the Outer Mole and increased speed, water came over the bow sill and flooded into the lower car deck (Deck G). At around 1827, the Herald capsized rapidly (in less than two minutes) and came to rest in shallow waters with her starboard side above the water. No fewer than 150 passengers and 38 crew lost their lives. Many others were injured. The chain of events and a limited inventory of the latent failures are shown in Case Study No. 5 (see Appendix). The single best source for more detailed information is the Department of Transport’s report on the formal investigation, conducted by Mr Justice Sheen, the Wreck Commissioner (published September 1987).
Mr Justice Sheen’s investigation was an interesting exception to the general tendency of postaccident inquiries to focus primarily upon active errors. It is worth quoting at some length from what he wrote about the management’s part in this catastrophe (Sheen, 1987, p. 14):
At first sight the faults which led to this disaster were the aforesaid errors of omission on the part of the Master, the Chief Officer and the assistant bosun, and also the failure by Captain Kirk to issue and enforce clear orders. But a full investigation into the circumstances of the disaster leads inexorably to the conclusion that the underlying or cardinal faults lay higher up in the Company. The Board of Directors did not appreciate their responsibility for the safe management of their ships. They did not apply their minds to the question: What orders should be given for the safety of our ships? The directors did not have any proper comprehension of what their duties were. There appears to have been a lack of thought about the way in which the HERALD ought to have been organised for the Dover/Zeebrugge run. All concerned in management, from the members of the Board of Directors down to the junior superintendents, were guilty of fault in that all must be regarded as sharing responsibility for the failure of management. From top to bottom the body corporate was infected with the disease of sloppiness.... The failure on the part of the shore management to give proper and clear directions was a contributory cause of the disaster.
5.6. King’s Cross
At 1925 on 18 November 1987, discarded smoker’s material probably set light to highly inflammable rubbish that had been allowed to accumulate in the running tracks of an escalator. Twenty minutes later, jets of flame shot up the escalator shaft and hit the ceiling of the ticket hall in which those evacuated via the Piccadilly and Victoria line escalators were gathering. Although a number of active failures were committed by the station staff and the emergency services in the intervening period, the primary causes of the disaster were present long before the start of the fire. These latent failures are summarised in Case Study No. 6 (see Appendix).
In the subsequent investigation (Fennell, 1988), the inspector placed the responsibility for the disaster squarely with the managements of London Regional Transport and its operating company, London Underground. Three quotations will serve to convey the flavour of his judgement.
The Chairman of London Regional Transport.... told me that whereas financial matters were strictly monitored, safety was not.... In my view, he was mistaken as to his responsibility. (Fennell, 1988, p. 17)
It is clear from what I heard that London Underground was struggling to shake off the rather blinkered approach which had characterised its earlier history and was in the middle of what the Chairman and Managing Director described as a change of culture and style. But in spite of that change the management remained of the view that fires were inevitable in the oldest most extensive underground system in the world. In my view they were fundamentally in error in their approach. (Fennell, 1988, p. 17)
I have devoted a chapter to the management of safety because the principal lesson to be learned from this tragedy is the right approach to safety. (Fennell, 1988, p. 18)
6. Distinguishing errors and violations
An important lesson to be learned from both the Chernobyl and Zeebrugge disasters is that the term ‘error’ does not capture all the ways in which human beings contribute to major accidents. An adequate framework for aberrant behaviours (literally ‘a straying from the path’) requires a distinction to be made between errors and violations. Both can be (and often are) present within the same action sequence, but they can also occur independently. One may err without committing a violation; a violation need not involve error.
Errors involve two distinct kinds of ‘straying’: the unwitting deviation of action from intention (slips and lapses) and the departure of planned actions from some satisfactory path towards a desired goal (mistakes). But this error classification, restricted as it is to individual information processing, offers only a partial account of the possible varieties of aberrant behaviour. What is missing is a further level of analysis acknowledging that, for the most part, humans do not plan and execute their actions in isolation, but within a regulated social milieu. While errors may be defined in relation to the cognitive processes of the individual, violations can only be described with regard to a social context in which behaviour is governed by operating procedures, codes of practice, rules and the like. For our purposes, violations can be defined as deliberate—but not necessarily reprehensible—deviations from those practices deemed necessary (by designers, managers and regulatory agencies) to maintain the safe operation of a potentially hazardous system.
The boundaries between errors and violations are by no means hard and fast, either conceptually or within a particular accident sequence. What is certain, however, is that dangerous aberrations cannot be studied exclusively within either the cognitive or the social psychological traditions; both need to be integrated within a single framework.
7. A preliminary classification of violations
7.1. The boundary categories
Violations may be committed for many reasons. One way of identifying the extremes of this range of possibilities is through the issue of intentionality. The first step is to ask: Was there a prior intention to commit this particular violation? If the answer is no, we can assign the violation to a category labelled erroneous or unintended violations. If the violation was deliberate, we need to know whether or not there was a prior intention to cause damage to the system. If there was, we can assign the violation to the general category of sabotage. Since the former category lies within the now well-defined province of error and the latter falls outside the scope of most accident scenarios, the violations of greatest interest are likely to be those occupying the middle ground, that is, violations having some degree of intentionality, but that do not involve the goal of system damage.
Within this broad hinterland of deliberate but nonmalevolent infringements, it is possible to make a further rough dichotomy between routine and exceptional violations. The former are largely habitual, forming an established part of an individual’s behavioural repertoire; the latter are singular violations occurring in a particular set of circumstances. The road environment provides multiple examples of routine violations. The behaviour of the Chernobyl operators in the 20 or so minutes before the explosions offers a clear instance of an exceptional set of violations.
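Read procedurally, this taxonomy amounts to a short decision tree. The sketch below is a purely illustrative rendering of it; the boolean attributes, type names and example values are labels introduced here for illustration and are not part of the original classification.

```python
from enum import Enum, auto

class ViolationCategory(Enum):
    UNINTENDED = auto()    # erroneous / unintended violation
    SABOTAGE = auto()      # deliberate, with prior intention to damage the system
    ROUTINE = auto()       # deliberate, non-malevolent, habitual
    EXCEPTIONAL = auto()   # deliberate, non-malevolent, singular

def classify_violation(intended: bool, intent_to_damage: bool, habitual: bool) -> ViolationCategory:
    """Ask the intentionality questions from the text, in order."""
    if not intended:
        return ViolationCategory.UNINTENDED
    if intent_to_damage:
        return ViolationCategory.SABOTAGE
    return ViolationCategory.ROUTINE if habitual else ViolationCategory.EXCEPTIONAL

# The Chernobyl operators' actions were deliberate, non-malevolent and singular:
print(classify_violation(intended=True, intent_to_damage=False, habitual=False).name)
# EXCEPTIONAL
```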
7.2. Routine violations
Two factors, in particular, appear to be important in shaping habitual violations: (a) the natural human tendency to take the path of least effort; and (b) a relatively indifferent environment (i.e., one that rarely punishes violations or rewards observance). Everyday observation shows that if the quickest and most convenient path between two task-related points involves transgressing an apparently trivial and rarely sanctioned safety procedure, then it will be violated routinely by the operators of the system. Such a principle suggests that routine violations could be minimised by designing systems with human beings in mind at the outset. Landscape architects are forever making the mistake of laying out pathways to satisfy aesthetic criteria rather than human needs; as a consequence, their symmetry is soon marred by muddy diagonal tracks across protected grassland.
7.3. Exceptional violations
Exceptional violations are not so clearly specified, being the product of a wide variety of local conditions. However, both the Chernobyl and the Zeebrugge disasters suggest the significance of what might loosely be called ‘system double-binds’—particular tasks or operating circumstances that make violations inevitable, no matter how well-intentioned the operators might be.
8. Psychological grounds for distinguishing errors and violations
One place where errors and violations are both abundant and relatively easy to observe is on the roads. In a recent study (Reason, Manstead, Stradling, Baxter, Campbell & Huyser, 1988), a Driver Behaviour Questionnaire (DBQ) was administered anonymously to 520 UK drivers of both sexes, covering a wide age range. The DBQ was made up of 50 items, each one describing either an error (a slip or a mistake) or a violation. The latter included both infringements of the Highway Code and deviations from accepted practice (e.g., driving too slowly on a two-lane rural highway). The respondents used a 5-point rating scale to indicate how frequently (over the past year) they had committed each type of ‘bad behaviour’.
The data were analysed using a factor analytic technique involving varimax rotation. Three orthogonal factors accounted for nearly 40 per cent of the variance. Items loading highly on factor 1 were violations (e.g., drinking and driving, close following, racing with other drivers, disregarding speed limits, shooting stop lights, etc.). Factors 2 and 3, however, were clearly associated with erroneous behaviour. The items loading highly on factor 2 tended to be hazardous errors: slips and mistakes that could have adverse consequences for other road users (e.g., failing to see ‘Give Way’ signs, failing to check mirror before manoeuvres, misjudging the speed of oncoming vehicles when overtaking, etc.). Items associated with factor 3, on the other hand, tended to be inconsequential lapses (e.g., taking the wrong exit at a roundabout, forgetting where one’s car is in a car park, driving to destination A when destination B was intended, etc.).
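For readers who want to see the mechanics of such an analysis, the sketch below shows one way a respondents-by-items matrix might be factored with a varimax (orthogonal) rotation. It is a minimal, assumption-laden illustration: random numbers stand in for the real 520 x 50 DBQ responses, and the scikit-learn estimator is merely one convenient implementation, not the method used in the original study.

```python
# Minimal sketch of a three-factor solution with varimax (orthogonal) rotation.
# Placeholder data: random 5-point ratings stand in for the real 520 x 50 DBQ matrix.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(520, 50)).astype(float)

fa = FactorAnalysis(n_components=3, rotation="varimax")
fa.fit(ratings)

loadings = fa.components_.T  # one row per item, one column per factor
for item, row in enumerate(loadings):
    dominant = int(np.argmax(np.abs(row)))
    print(f"item {item:2d}: highest loading on factor {dominant}")
# In the study itself, one factor grouped the violation items and the
# other two grouped the hazardous errors and the inconsequential lapses.
```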
This analysis provided strong support for the belief that errors and violations are mediated by different cognitive mechanisms. This conclusion was further endorsed by the age and sex relationships. Violations declined with age, errors did not. Men at all ages reported more violations than women. Women were significantly more lapse-prone than men (or more honest!). These self-report data also correspond closely with what we know of the relative contributions of men and women at various ages to road accidents (Storie, 1977).
9. A resident pathogen metaphor
The case studies considered earlier, along with numerous others (Turner, 1978; Perrow, 1984), indicate that major disasters in defended systems are rarely if ever caused by any one factor, either mechanical or human. Rather, they arise from the unforeseen and usually unforeseeable concatenation of several diverse events, each one necessary but singly insufficient.
These observations suggest an analogy between the breakdown of complex technological systems and the aetiology of multiple-cause illnesses such as cancer and cardiovascular disease. More specifically, there appear to be similarities between latent failures in complex technological systems and resident pathogens in the human body.
The resident pathogen metaphor emphasises the significance of causal factors present in the system before an accident sequence actually begins. All man-made systems contain potentially destructive agencies, like the pathogens within the human body. At any one time, each complex system will have within it a certain number of latent failures, whose effects are not immediately apparent but that can serve both to promote unsafe acts and to weaken its defence mechanisms. For the most part, they are tolerated, detected and corrected, or kept in check by protective measures (the auto-immune system). But every now and again, a set of external circumstances—called here local triggers—arises that combines with these resident pathogens in subtle and often unlikely ways to thwart the system’s defences and to bring about its catastrophic breakdown.
In medicine, a good deal more is known about the nature of active failures (i.e., trauma, invasive agencies, acute diseases, etc.) than about the action of resident pathogens. The same is true in the systems reliability field; single component failures or simple human errors can be foreseen and contained by built-in safety devices, but these engineered defences offer little protection against certain combinations of system pathogens and local triggers. In addition, there are interesting parallels between the aetiologies of pathogen-related diseases and the catastrophic breakdown of complex, opaque technical installations. Both seem to require the breaching of defences by a concatenation of resident pathogens and external triggering events, though in both cases the precise nature of this interaction is hard to predict.
The resident pathogen notion directs attention to the indicators of ‘system morbidity’ that are present prior to a catastrophic breakdown. These are, in principle, more open to detection than the often bizarre and unforeseeable local triggering events. Implicit in the metaphor is the notion that the likelihood of an accident will be some function of the number of pathogens currently present within the sociotechnical system. The greater the number of pathogens residing in a system, the more likely it is to encounter just that particular combination of triggering conditions sufficient to complete an accident sequence.
Other things being equal, the more complex, interactive, tightly-coupled and opaque the system, the greater the number of resident pathogens it is likely to contain. However, while simpler systems are usually less interactive, less centralised and more transparent, they tend to be considerably less evolved with regard to built-in defences. Thus, relatively few pathogens can often wreak greater havoc in simpler systems than in more advanced ones.
An important corollary of these arguments is that the risk of an accident will be diminished if these pathogens are detected and neutralised proactively. However, like cancer and heart disease, accidents have multiple causes. The occurrence of an accident is not simply determined by the sheer number of pathogens in the system; their adverse effects have to find windows of opportunity to pass through the various levels of the system and, most particularly, through the defences themselves. In short, a large number of stochastic factors are involved.
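One purely illustrative way of writing this corollary down (the argument here is deliberately qualitative and offers no formal model) is to treat each level of the system, including its defences, as a stochastic filter that a pathogen's effects must pass through before an accident can occur:

$$
P(\text{accident}) \;\approx\; P(\text{local trigger}) \times \prod_{i=1}^{d} P\big(\text{level } i \text{ penetrated} \mid N_{\text{pathogens}}\big)
$$

Each factor on the right plausibly increases with the number of resident pathogens, yet any particular accident still requires every window of opportunity to be open at the same moment, which is where the stochastic character of the process resides.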
The resident pathogen metaphor has a number of attractive features, but it is far from being a workable theory. Its terms are still unacceptably vague. Moreover, it shares a number of features with the now largely discredited accident proneness theory, though it operates at a systemic rather than at an individual level.
Accident proneness theory had two elements. First, there was the purely statistical observation that certain people have more than their chance share of accidents, as determined by the Poisson model. Second, and much more controversially, there was the assumption that this unequal liability originated in some relatively enduring feature of the individual (e.g., personality traits, information-processing deficiencies, physical characteristics and the like). In the pathogen metaphor, comparable assertions are being made about systems rather than individuals. Here it is argued that some systems have a greater accident liability because of their larger accumulation of resident pathogens. The major difference, of course, lies in their respective remedial implications. Accident proneness theory, predicated as it is upon stable dispositional factors, offers no alternative other than the screening out of high-liability individuals; pathogen theory leads to a search for preaccident morbidity indicators and assumes that these are remediable.
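The statistical half of the claim is easy to illustrate. The sketch below uses hypothetical counts, invented purely for illustration, to compare an observed distribution of accidents per person with the Poisson distribution expected if every individual carried the same chance liability; an excess of high-accident individuals is what ‘more than their chance share’ means.

```python
# Hypothetical accident counts per person over a fixed period, compared with
# the Poisson expectation under equal chance liability. All figures are invented.
import numpy as np
from scipy.stats import poisson

observed = np.array([92, 60, 30, 12, 6])   # people with 0, 1, 2, 3, 4 accidents
n_people = observed.sum()
mean_rate = (observed * np.arange(5)).sum() / n_people

expected = n_people * poisson.pmf(np.arange(5), mean_rate)
for k, (obs, exp) in enumerate(zip(observed, expected)):
    print(f"{k} accidents: observed {obs:3d}, Poisson expectation {exp:6.1f}")
# The heavier-than-Poisson upper tail is the 'unequal liability' the theory set out to explain.
```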
Accident proneness theory failed because it was found that unequal accident liability was, in reality, a ‘club’ with a rapidly changing membership. In addition, attempts to find a clearly definable accident-prone personality proved fruitless.
The pathogen metaphor would suffer a similar fate if it were found that pathogens could only be identified retrospectively in relation to a specific set of accident circumstances in a particular system. For the pathogen metaphor to have any value, it is necessary to establish an a priori set of indicators relating to system morbidity and then to demonstrate clear causal connections between these indicators and accident liability across a wide range of complex systems and in a variety of accident conditions.
10. A general view of accident causation in complex systems
This section seeks to extend the pathogen metaphor in order to lay the foundations of a possible theoretical framework for considering the aetiology of accidents in complex technological systems. As indicated earlier, the challenge for such a framework is not just to provide an account of how latent and active failures combine to produce accidents, but also to indicate where and how more effective remedial measures might be applied. The framework has as its building blocks the basic elements of production common to any complex system (Wreathall, 1989).
10.1. The basic elements of production
The notion of production offers a reasonably uncontroversial starting point. All complex technologies are involved in some form of production. The product can be energy, a chemical substance or the mass transportation of people by road, rail, sea or air.
Figure 7.4 identifies the basic elements common to all such productive systems. These elements are represented diagrammatically as planes, one behind the other. We can think of these planes as identifying the essential, benign components of effective production.