The main thrust of this view of accident causation is towards the establishment of loops 3 and 4: pathogen auditing. The theory dictates that the most effective way of managing safety is by acting upon types rather than tokens, that is, by influencing system states occurring early on in the accident sequence. To identify these failure types and to find ways of neutralizing the pathogens so revealed represent the major challenges facing human factors researchers concerned with preserving the safety of complex, high-risk systems.
10.4. General indicators
The general indicators shown in Figure 7.9 cover two broad aspects of an organisation’s safety functioning. The first relates to the variety and sensitivity of its feedback loops. The second deals with the decision makers’ responsiveness to safety-related data. No amount of feedback will enhance an organisation’s degree of safety if the information supplied is not acted upon in a timely and effective manner.
Westrum (1988) has provided a simple but meaningful classification of the ways in which organisations may differ in their reactions to safety data. His basic premise is that: “Organizations think. Like individuals, they exhibit a consciousness, a memory, and an ability to create and to solve problems. Their thinking strongly affects the generation and elimination of hazards.” Organisational responses to hazards fall into three groups: denial, repair and reform actions.
Denial Actions
Suppression: Observers are punished or dismissed, and the observations expunged from the record.
Encapsulation: Observers are retained, but the validity of their observations is disputed or denied.
Repair Actions
Public Relations: Observations emerge publicly, but their significance is denied; they are sugar-coated.
Local Repairs: The problem is admitted and fixed at the local level, but its wider implications are denied.
Reform Actions
Dissemination: The problem is admitted to be global, and global action is taken upon it.
Reorganisation: Action on the problem leads to reconsideration and reform of the operational system.
The more effective the organisation, the more likely it is to respond to safety data with actions from the bottom of this list (i.e., reform), while those less adequate will employ responses from the top (i.e., denial).
Westrum then uses these reactions to define organisations along a scale of what he calls ‘cognitive adequacy’, or the effectiveness of their ways of thinking about hazard. These are grouped under three headings: pathological, calculative and generative organisations.
(a) Pathological organisations are ones whose safety measures are inadequate even under normal conditions. These organisations sacrifice safety goals in the pursuit of production goals, often under severe economic pressures, and actively circumvent safety regulations. Information about hazardous conditions is suppressed at the source by suppressing or encapsulating the messenger (e.g., the Tennessee Valley Authority’s nuclear power plant management).
(b) Calculative organisations try to do the best job they can using ‘by-the-book’ methods. These are usually adequate under normal operating conditions, but often fail when they encounter unforeseen circumstances. In short, they may implement many safety practices but have little in the way of effective disaster plans (e.g., the New Jersey chemical industry, the CEGB, the U.K. Department of Energy).
(c) Generative organisations are characterised by a high degree of ostensibly irregular or unconventional activity in furthering their goals. They set targets for themselves beyond ordinary expectations and fulfil them because they are willing to do unexpected things in unexpected ways. They emphasise results rather than methods, and value substance more than form. Hazards tend to be quickly discovered and neutralised because lower-level personnel have both permission to see and permission to do (e.g., U.S. nuclear aircraft carrier flight deck operations—see Rochlin, LaPorte & Roberts, 1987).
11. Learning the right lessons from past accidents
It is not easy to learn the right lessons from past disasters, especially if these events are likely to further undermine public confidence in the safety of one’s own similar technologies. Institutional reactions to other people’s catastrophes reveal, among other things, two universal human failings: the fundamental attribution error and the fundamental surprise error.
The fundamental attribution error has been widely studied in social psychology (see Fiske & Taylor, 1984; see also Chapter 2, Section 3.5). This refers to a pervasive tendency to blame bad outcomes on an actor’s personal inadequacies (i.e., dispositional factors) rather than attribute them to situational factors beyond his or her control. Such tendencies were evident in both the Russian and the British responses to the Chernobyl accident. Thus, the Russian report on Chernobyl (USSR State Committee on the Utilization of Atomic Energy, 1986) concluded that: “The prime cause of the accident was an extremely improbable combination of violations of instructions and operating rules.” Lord Marshall, Chairman of the U.K. Central Electricity Generating Board (CEGB), wrote a foreword to the U.K. Atomic Energy Authority’s report upon the Chernobyl accident (UKAEA, 1987), in which he assigned blame in very definite terms: “To us in the West, the sequence of reactor operator errors is incomprehensible. Perhaps it came from supreme arrogance or complete ignorance. More plausibly, we can speculate that the operators as a matter of habit had broken rules many, many times and got away with it so the safety rules no longer seemed relevant.” Could it happen in the U.K.? “My own judgement is that the overriding importance of ensuring safety is so deeply engrained in the culture of the nuclear industry that this will not happen in the U.K.”
The term fundamental surprise was coined by an Israeli social scientist, Zvi Lanir (Lanir, 1986), in regard to the Yom Kippur War, but it is particularly apt for both the TMI-2 and Chernobyl accidents. A fundamental surprise reveals a profound discrepancy between one’s perception of the world and the reality. A major reappraisal is demanded. Situational surprises, on the other hand, are localised events requiring the solution of specific problems.
Lanir likens the difference between situational and fundamental surprise to that between ‘surprise’ and ‘astonishment’ and illustrates it with an anecdote from Webster, the lexicographer. One day, Webster returned home to find his wife in the arms of his butler. “You surprised me,” said his wife. “And you have astonished me,” responded Webster. Mrs Webster experienced merely a situational surprise; Mr Webster suffered a fundamental one.
The natural human tendency is to respond to fundamental surprises as if they were only situational ones. Thus, the fundamental surprise error “is to avoid any fundamental meaning and to learn the situational lessons from the surface events” (Lanir, 1986).
At the Sizewell B public inquiry (Layfield, 1987), the CEGB witnesses sought to dissociate the future station from the troubles that beset the Metropolitan Edison pressurised water reactor (PWR) on Three Mile Island on 28 March 1979. They identified various salient features of the TMI-2 accident—the steam power plant, sticky relief valves, poor control room design, inadequate operator training, etc.—and asserted that these and more would be significantly improved in the Sizewell B PWR. In his assessment of this evidence, Sir Frank Layfield hinted at broader issues: “Some aspects of the TMI accident give warnings which are of general importance,” but then concluded that they were not applicable in the U.K. due to organisational differences.
This natural urge to distance U.K. installations from foreign catastrophes was even more apparent in the UKAEA’s (1987) analysis of the Chernobyl disaster that concluded: “the Chernobyl accident was unique to the [Russian] reactor design and there are few lessons for the United Kingdom to learn from it. Its main effect has been to reinforce and reiterate the importance and validity of existing UK standards.”
So what are the right lessons to be learned from TMI and Chernobyl? These, I believe, have been well stated for TMI-2 by David Woods, previously of the Westinghouse Corporation. The same general conclusions apply equally to Chernobyl and to the Bhopal, Challenger and Zeebrugge accidents (Woods, 1987):
The TMI accident was more than an unexpected progression of faults; it was more than a situation planned for but handled inadequately; it was more than a situation whose plan had proved inadequate. The TMI accident constituted a fundamental surprise in that it revealed a basic incompatibility between the nuclear industry’s view of itself and reality. Prior to TMI the industry could and did think of nuclear power as a purely technical system where all the problems were in the form of some technical area or areas and the solutions to these problems lay in those engineering disciplines. TMI graphically revealed the inadequacy of that view because the failures were in the socio-technical system and not due to pure technical nor pure human factors.
Regardless of the technology or the country that it serves, the message of this chapter is very clear: No one holds the monopoly on latent failures. And these resident pathogens constitute the primary residual risk to complex, highly-defended technological systems.
12. Postscript: On being wise after the event
This chapter has argued that most of the root causes of serious accidents in complex technologies are present within the system long before an obvious accident sequence can be identified. In theory, at least, some of these latent failures could have been spotted and corrected by those managing, maintaining and operating the system in question. In addition, there were prior warnings of likely catastrophe for most of the accidents considered here. The Rogovin inquiry (Rogovin, 1979), for example, discovered that the TMI accident had ‘almost happened’ twice before, once in Switzerland in 1974 and once at the Davis-Besse plant in Ohio in 1977. Similarly, an Indian journalist wrote a prescient series of articles about the Bhopal plant and its potential dangers three years before the tragedy (see Marcus & Fox, 1988). Other unheeded warnings were also available prior to the Challenger, Zeebrugge and King’s Cross accidents.
For those who pick over the bones of other people’s disasters, it often seems incredible that these warnings and human failures, seemingly so obvious in retrospect, should have gone unnoticed at the time. Blessed with both uninvolvement and hindsight, retrospective observers are greatly tempted to slip into a censorious frame of mind and to wonder how these people could have been so blind, stupid, arrogant, ignorant or reckless.
One purpose of this concluding section is to caution strongly against adopting such a judgemental stance. No less than the accident-producing errors themselves, the apparent clarity of retrospection springs in part from the shortcomings of human cognition. The perceptual biases and strong-but-wrong beliefs that make incipient disasters so hard to detect by those on the spot also make it difficult for accident analysts to be truly wise after the event. Unless we appreciate the potency of these retroactive distortions, we will never truly understand the realities of the past, nor learn the appropriate remedial lessons.
There is one obvious but psychologically significant difference between ourselves, the retrospective judges, and the people whose decisions, actions or inactions led to a disaster: we know how things were going to turn out; they did not. As Baruch Fischhoff and his colleagues have shown, possession of outcome knowledge profoundly influences the way we survey past events (Fischhoff, 1975; Slovic & Fischhoff, 1977; Fischhoff, 1989). This phenomenon is called hindsight bias, and it has two aspects:
(a) The ‘knew-it-all-along’ effect (or ‘creeping determinism’), whereby observers of past events exaggerate what other people should have been able to anticipate in foresight. If they were involved in these events, they tend to exaggerate what they themselves actually knew in foresight.
(b) Historical judges are largely unaware of the degree to which outcome knowledge influences their perceptions of the past. As a result, they overestimate what they would have known had they not possessed this knowledge.
The historian George Florovsky described this phenomenon very precisely: “In retrospect, we seem to perceive the logic of the events which unfold themselves in a regular or linear fashion according to a recognizable pattern with an alleged inner necessity. So that we get the impression that it really could not have happened otherwise” (quoted by Fischhoff, 1975, p. 288).
Outcome knowledge dominates our perceptions of the past, yet we remain largely unaware of its influence. For those striving to make sense of complex historical events, familiarity with how things turned out imposes a definite but unconscious structure upon the antecedent actions and conditions. Prior facts are assimilated into this schema to make a coherent causal story, a process similar to that observed by Bartlett (1932) in his studies of remembering. But to those involved at the time these same events would have had no such deterministic logic. Each participant’s view of the future would have been bounded by local concerns. Instead of one grand convergent narrative, there would have been a multitude of individual stories running on in parallel towards the expected attainment of various distinct and personal goals.
Before judging too harshly the human failings that concatenate to cause a disaster, we need to make a clear distinction between the way the precursors appear now, given knowledge of the unhappy outcome, and the way they seemed at the time. Wagenaar and Groeneweg (1988) have coined the term impossible accident to convey the extreme difficulty that those involved had in foreseeing any adverse conjunction between what seemed at the time to be unconnected and, in many instances, not especially unusual or dangerous happenings. They concluded their review of 100 shipping accidents with the following comment (Wagenaar & Groeneweg, 1988, p. 42):
Accidents appear to be the result of highly complex coincidences which could rarely be foreseen by the people involved. The unpredictability is caused by the large number of causes and by the spread of the information over the participants.... Accidents do not occur because people gamble and lose, they occur because people do not believe that the accident that is about to occur is at all possible.
The idea of personal responsibility is deeply rooted in Western cultures (Turner, 1978). The occurrence of a man-made disaster leads inevitably to a search for human culprits. Given the ease with which the contributing human failures can subsequently be identified, such scapegoats are not hard to find. But before we rush to judgement, there are some important points to be kept in mind. First, most of the people involved in serious accidents are neither stupid nor reckless, though they may well have been blind to the consequences of their actions. Second, we must beware of falling prey to the fundamental attribution error (i.e., blaming people and ignoring situational factors). As Perrow (1984) argued, it is in the nature of complex, tightly-coupled systems to suffer unforeseeable sociotechnical breakdowns. Third, before beholding the mote in his brother’s eye, the retrospective observer should be aware of the beam of hindsight bias in his own.
8 Assessing and reducing the human error risk
This book began by discussing the nature of human error and the theoretical influences that have shaped its study. It then proposed distinctions between error types based on performance levels and error forms derived from basic memory retrieval mechanisms. Chapters 4 and 5 presented a framework theory of error production, and Chapter 6 considered the various processes by which errors are detected. The preceding chapter examined some of the consequences of human error in high-risk technologies, looking in particular at the effects of latent failures.
It is clear from this summary that the bulk of the book has favoured theory rather than practice. This concluding chapter seeks to redress the balance somewhat by focusing upon remedial possibilities. It reviews both what has been done and what might be done to minimise the often terrible costs of human failures in potentially hazardous environments. More specifically, it deals with the various techniques employed or proposed by human reliability specialists to assess and to reduce the risks associated with human error.
The chapter has been written with two kinds of reader in mind: psychologists who are unfamiliar with the methods of human reliability analysis (not surprisingly, since most of this material is published outside the conventional psychological literature), and safety practitioners of one sort or another. For the sake of the former, I have tried to make clear the model and assumptions underlying each technique. This means that relatively little space is left to tell the practitioners exactly how these methods should be applied in specific contexts. To compensate for this, I will indicate (where these are available and/or appropriate) where more detailed procedural information can be obtained. For the benefit of both types of reader, however, I will state briefly what is known of the reliability and validity of these methods, where such data exist.
The development of human reliability analysis (HRA) techniques has been intimately bound up with the fortunes and misfortunes of the nuclear power industry. This does not mean that such methods are applicable only to the design and operation of nuclear power plants—they have been pioneered and widely used in other industries and organisations—but it is certainly true that nuclear power generation has been the focus of most human reliability developments over the past two decades. Since nuclear power applications will feature so extensively here, it is worth dwelling briefly on why this has been the case.
The first reason has to do with the fears—neither entirely baseless nor altogether rational—excited by anything nuclear, particularly in Europe. This public concern over the safety of nuclear power generation was considerably heightened by the Chernobyl disaster. In June 1988, the industry’s technical magazine, Nuclear Engineering International, reported the results of its annual world survey, which showed that 10 countries, mostly in Europe, had postponed or cancelled reactor orders. This occurred even in countries like Finland and Belgium, where the good operating record of their existing stations had previously created a climate favourable to the development of nuclear power.