Human Error


by James Reason


  Figure 7.4. The basic elements of production. These constitute the necessary and benign components of any productive system.

  10.1.1. The decision makers

  These include both the architects and the high-level managers of the system. Once in operation, they set the goals for the system as a whole in response to inputs from the outside world. They also direct, at a strategic level, the means by which these goals should be met. A large part of their function is concerned with the allocation of finite resources. These comprise money, equipment, people (talent and expertise) and time. Their aim is to deploy these resources to maximise both productivity and safety.

  10.1.2. Line management

  These are the departmental specialists who implement the strategies of the decision makers within their particular spheres of operation. They go by various labels: operations, training, sales, maintenance, finance, procurement, safety, engineering support, personnel and so on.

  10.1.3. Preconditions

  Appropriate decisions and effective line management are clearly prerequisites for successful production. But they are not of themselves sufficient. We need something between the line managers and the productive activities. These are a set of qualities possessed by both machines and people: reliable equipment of the right kind; a skilled and knowledgeable workforce; an appropriate set of attitudes and motivators; work schedules, maintenance programmes and environmental conditions that permit efficient and safe operations; and codes of practice that give clear guidance regarding desirable (safe and/or efficient) and undesirable (unsafe and/or inefficient) performance —to name but a few.

  10.1.4. Productive activities

  These are the actual performances of humans and machines: the precise synchronisation of mechanical and human activities in order to deliver the right product at the right time.

  10.1.5. Defences

  Where productive activities involve exposure to natural or intrinsic hazards, both individuals and machines should be supplied with safeguards sufficient to prevent foreseeable injury, damage or costly outages.

  10.2. The human elements of accident causation

  These are represented in Figure 7.5. It should be noted that a parallel diagram could equally well have been drawn for the purely mechanical or technical failures. However, our principal concern is with the human contribution to systems accidents, because accident analyses reveal that human factors dominate the risks to complex installations. Even what appear at first sight to be simple equipment breakdowns can usually be traced to some prior human failure. Nevertheless, it is important to acknowledge that any component or piece of equipment has a limited reliable life; all such items may fail for engineering rather than human reasons.

  The essence of Figure 7.5 is that it portrays these human contributions as weaknesses or ‘windows’ in the basic productive ‘planes’ (shown in Figure 7.4). Here we show the dark side of the production picture. The causal sequence moves from fallible decisions, through the intervening planes to an accident, that is, the unplanned and uncontrolled release of some destructive force, usually in the presence of victims. In what follows, I will elaborate upon the nature of each of these ‘malign planes,’ beginning with fallible decisions.

  Figure 7.5. The various human contributions to the breakdown of complex systems are mapped onto the basic elements of production. It is assumed that the primary systemic origins of latent failures are the fallible decisions taken by top-level plant and corporate managers. These are then transmitted via the intervening elements to the point where system defences may be breached.

  10.2.1. Fallible decisions

  A basic premise of this framework is that systems accidents have their primary origins in fallible decisions made by designers and high-level (corporate or plant) managerial decision makers.

  This is not a question of allocating blame, but simply a recognition of the fact that even in the best-run organisations a significant number of influential decisions will subsequently prove to be mistaken. This is a fact of life. Fallible decisions are an inevitable part of the design and management process. The question is not so much how to prevent them from occurring, as how to ensure that their adverse consequences are speedily detected and recovered.

  In considering fallible decisions, it is important to be aware of the context in which high-level decisions are taken. Figure 7.6 summarises some of the constraints facing corporate and senior plant managers. All organisations have to allocate resources to two distinct goals: production and safety. In the long term, these are clearly compatible goals. But, given that all resources are finite, there are likely to be many occasions on which there are short-term conflicts of interest. Resources allocated to the pursuit of production could diminish those available for safety; the converse is also true. These dilemmas are exacerbated by two factors:

  (a) Certainty of outcome. Resources directed at improving productivity have relatively certain outcomes; those aimed at enhancing safety do not, at least in the short term. This is due in large part to the substantial contribution of stochastic elements in accident causation.

  (b) Nature of the feedback. The feedback generated by the pursuit of production goals is generally unambiguous, rapid, compelling and (when the news is good) highly reinforcing. That associated with the pursuit of safety goals is largely negative, intermittent, often deceptive and perhaps only compelling after a major accident or a string of incidents. Production feedback will, except on these rare occasions, always speak louder than safety feedback. This makes the managerial control of safety extremely difficult.
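  The asymmetry between these two feedback channels can be made concrete with a toy simulation. The sketch below is purely illustrative: the return rates, accident probabilities and period lengths are assumptions invented for the example, not figures from any study. Its only purpose is to show why, over any short window, the production channel speaks constantly while the safety channel is largely silent.

```python
import random

random.seed(42)

PERIODS = 52  # e.g., weekly feedback over one year (illustrative)

def production_feedback(invest):
    """Production returns: prompt, frequent and nearly deterministic (assumed model)."""
    return [invest * 1.1 + random.gauss(0, 0.05 * invest) for _ in range(PERIODS)]

def accidents(safety_invest, base_prob=0.02):
    """Safety 'returns': an accident either occurs in a period or it does not.
    Spending lowers an already small per-period probability (assumed model),
    so most periods yield no signal at all."""
    p = base_prob / (1.0 + safety_invest)
    return sum(random.random() < p for _ in range(PERIODS))

prod = production_feedback(invest=1.0)
print(f"production: a signal every period, mean return {sum(prod) / PERIODS:.2f}")
print(f"accidents with no safety investment:    {accidents(0.0)} in {PERIODS} periods")
print(f"accidents with heavy safety investment: {accidents(2.0)} in {PERIODS} periods")
# With so few events, the two safety series are often indistinguishable over a
# single year -- the intermittent, often deceptive feedback described above.
```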

  Furthermore, decision makers do not always interpret feedback on either the production or the safety channels accurately. Defensive ‘filters’ may be interposed that both protect them from bad news and encourage extrapunitive reactions. Thus, poor achievement on the production front can be blamed upon an inadequate workforce, union interference, market forces, world recession, shortages of materials and the like. A bad safety record can be attributed to operator carelessness or incompetence. This position is sometimes consolidated by cataloguing the various safeguards, engineered safety devices and safe operating practices that have already been implemented. Indeed, given the almost diabolical nature of some accident sequences, these are perfectly understandable reactions. But they nevertheless block the discovery of effective remedies and contribute to further fallible decisions.

  Figure 7.6. A summary of some of the factors that contribute to fallible, high-level decision making. Resources allocated to production and safety goals differ (a) in their certainty of outcome, and (b) in the nature and impact of their respective feedback.

  10.2.2. Line management deficiencies

  On this ‘plane’, the consequences of fallible decisions manifest themselves differently in the various line management departments. Of course, it would be naive to assume that the pathology or otherwise of a given line department is purely a function of higher-level decision making. The native incompetence of any set of line managers could further exacerbate the adverse effects of high-level decisions or even cause good decisions to have bad effects. Conversely, competence at the line management level could do something to mitigate the unsafe effects of fallible decisions, make neutral decisions have safer consequences, and transform good decisions into even better ones. Nevertheless, the scope for line management intervention is, in very real terms, constrained by the size of their departmental budgets or the resources at their disposal. For theoretical purposes, we will assume that these allocations will have been decided at a higher level in the system. Indeed, such allocations constitute a major part of the output of higher-level decision making.

  The interaction between line management deficiencies and the psychological precursors of unsafe acts is extremely complex. There is a many-to-many mapping between possible line management deficiencies and the various psychological precursors of unsafe acts. For example, deficiencies in the training department can manifest themselves as a variety of preconditions: high workload, undue time pressure, inappropriate perception of hazards, ignorance of the system and motivational difficulties. Likewise, any one precondition (e.g., undue time pressure) could be the product of many different line management deficiencies (e.g., poor scheduling, poor procedures, deficiencies in skills, rules, or knowledge and maintenance inadequacies).

  A useful way of thinking about these transformations is as failure types converting into failure tokens (Hudson, 1988). Deficient training is a pathogen type that can reveal itself, on the precondition plane, as a variety of pathogenic tokens. Such a view has important remedial consequences. Rectifying a particular failure type could, in principle, remove a wide and varied class of tokens. The type-token distinction is intrinsically hierarchical. Condition tokens at this level of analysis become types for the creation of unsafe act tokens at our next stage of analysis.
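  The type-token relationship can be pictured as a simple mapping. The sketch below is illustrative only; the entries echo the examples given in the text (deficient training, poor scheduling, undue time pressure and so on) rather than offering a taxonomy of their own.

```python
# Illustrative sketch of the failure type -> failure token relationship.
# The entries echo examples given in the text; they are not a full taxonomy.
FAILURE_TYPES = {
    "deficient training": [
        "high workload",
        "undue time pressure",
        "inappropriate perception of hazards",
        "ignorance of the system",
        "motivational difficulties",
    ],
    "poor scheduling": ["undue time pressure", "high workload"],
    "maintenance inadequacies": ["undue time pressure", "unreliable equipment"],
}

def tokens_removed_by_fixing(failure_type):
    """Rectifying one failure type removes a whole class of condition tokens."""
    return FAILURE_TYPES.get(failure_type, [])

def types_behind(token):
    """Conversely, a single token can be the product of several deficiencies."""
    return [t for t, toks in FAILURE_TYPES.items() if token in toks]

print(tokens_removed_by_fixing("deficient training"))
print(types_behind("undue time pressure"))
```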

  10.2.3. Preconditions for unsafe acts

  Preconditions or psychological precursors are latent states. They create the potential for a wide variety of unsafe acts. The precise nature of these acts will be a complex function of the task being performed, the environmental influences and the presence of hazards. Each precursor can contribute to a large number of unsafe acts, depending upon the prevailing conditions.

  At this level, the type-token distinction becomes especially significant due to the one-to-many mapping between precursors and unsafe acts. A particular psychological precursor, either alone or in combination with other precursors, can play a significant part in both provoking and shaping an almost infinitely large set of unsafe acts. But the precise nature, time, place and perpetrator of any single act are almost impossible to anticipate, though we can apply some general predictive principles.

  The stochastic character of this onward mapping reveals the futility of ‘tokenism’—the focusing of remedial efforts upon preventing the recurrence of specific unsafe acts. Although certain of these acts may fall into an easily recognisable subclass (e.g., failing to wear personal safety equipment in the presence of hazards) and so be amenable to targeted safety programmes and training, most of them are unforeseeable, sometimes even quite bizarre. The only sensible way of dealing with them is, first, to eliminate their preconditions as far as possible and, second, to accept that whatever the measures taken, some unsafe acts will still occur, and so provide defences that will intervene between the act and its adverse consequences.

  As in the case of line management deficiencies, not all unsafe act precursors result from fallible decisions. Many of the pathogens at this level are introduced directly by the human condition. The capacities for being stressed, failing to perceive hazards, being imperfectly aware of the system and having less than ideal motivation are brought by each person into the workplace. Thus, in causal terms, there is only a loose coupling between the line management and precursor ‘planes’. The point to stress is that these predispositions can either be markedly exaggerated or considerably mitigated by the character of decisions made at the top levels of the system and communicated to the individual via line departments. Even the best-run organisations cannot eliminate the harmful psychological effects of negative life events (e.g., marriage breakdowns, sickness in the family, bereavements, etc.) occurring outside the workplace. But they can anticipate the possibility, if not the particular form, of such events and provide adequate defences against their unsafe consequences.

  10.2.4. Unsafe acts

  Even more than their psychological precursors, the commission of unsafe acts is determined by a complex interaction between intrinsic system influences (of the kind described for the preceding three ‘planes’) and those arising from the outside world. This has to do both with protean environmental factors and with the particular form of the existing hazards. Thus, an unsafe act can only be defined in relation to the presence of a particular hazard. There is nothing inherently unsafe about not wearing a safety helmet or a life jacket. Such omissions only constitute unsafe acts when they occur in potentially hazardous situations (i.e., when heavy objects are likely to fall from above, or in close proximity to deep water). An unsafe act is more than just an error or a violation—it is an error or a violation committed in the presence of a potential hazard: some mass, energy or toxicity that, if not properly controlled, could cause injury or damage. A classification of unsafe acts based upon arguments presented earlier in this book is shown in Figure 7.7.
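  A rough illustration of how such a classification might be applied is sketched below. It uses the slip/lapse/mistake/violation vocabulary developed earlier in the book, but the decision logic is a simplified reading of Figure 7.7, not a reproduction of it, and the parameter names are invented for the example.

```python
def classify_unsafe_act(hazard_present, action_as_intended,
                        deliberate_deviation=False, attentional_failure=False):
    """Classify an act roughly along the lines of Figure 7.7 (simplified sketch).

    An act only counts as unsafe when a potential hazard is present."""
    if not hazard_present:
        return "not an unsafe act (no hazard present)"
    if not action_as_intended:
        # Unintended acts are failures of execution.
        if attentional_failure:
            return "slip (attentional failure)"
        return "lapse (memory failure)"
    # Intended acts are either planning failures or deliberate deviations.
    if deliberate_deviation:
        return "violation"
    return "mistake (the plan itself was inadequate)"

# Not wearing a helmet only becomes an unsafe act where objects may fall:
print(classify_unsafe_act(hazard_present=False, action_as_intended=True,
                          deliberate_deviation=True))
print(classify_unsafe_act(hazard_present=True, action_as_intended=True,
                          deliberate_deviation=True))
```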

  10.2.5. Defences: The limited window of accident opportunity

  A system’s defences can be made up of many elements. At the lowest level of sophistication, they may consist of little more than personal safety equipment for the workforce and guards preventing direct contact with dangerous materials or moving parts. At the other extreme, there are the ‘defences in depth’ of nuclear power plants. These comprise both people (the control room operators) and many (both redundant and diverse) engineered features such as automatic safety devices and levels of containment.

  Figure 7.7. A summary of the psychological varieties of unsafe acts, classified initially according to whether the act was intended or unintended and then distinguishing errors from violations.

  Figure 7.8. The dynamics of accident causation. The diagram shows a trajectory of accident opportunity penetrating several defensive systems. This results from a complex interaction between latent failures and a variety of local triggering events. It is clear from this figure, however, that the chances of such a trajectory of opportunity finding loopholes in all of the defences at any one time are very small indeed.

  Very few unsafe acts result in actual damage or injury, even in relatively unprotected systems. And in highly protected systems, the various layers of defence can only be breached by the adverse conjunction of several different causal factors. Some of these are likely to be latent failures of pathogenic origin, others will be local triggering events such as the commission of a set of unsafe acts in a highly specific set of circumstances—often associated with some atypical system condition (i.e., the unusually low temperature preceding the Challenger launch, the testing carried out prior to the annual shut-down at Chernobyl-4, and the nose-down trim of the Herald of Free Enterprise due to a combination of high tide and unsuitable docking facilities).

  Figure 7.8 tries to capture some of the stochastic features involved in the unlikely coincidence of an unsafe act and a breach in the system’s defences. It shows a trajectory of opportunity originating in the higher levels of the system, passing via the precondition and unsafe act planes and then on through three successive layers of defence. Each of these planes has windows of opportunity, but they are in continual flux due to the largely unpredictable influences of both intrinsic and extrinsic factors. On each plane, the areas of permeability or windows vary over time in both their location and their size, and these changes have different time constants at different levels of the system. This picture emphasises the unlikelihood of any one set of causal factors finding an appropriate trajectory.
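  The improbability that Figure 7.8 emphasises can also be illustrated numerically. In the sketch below, each plane is assumed to have an independently fluctuating ‘window’ whose probability of being open at any instant is drawn at random; the particular values are invented for illustration and carry no empirical weight.

```python
import random

random.seed(1)

PLANES = ["fallible decisions", "line management deficiencies",
          "psychological precursors", "unsafe acts",
          "defence layer 1", "defence layer 2", "defence layer 3"]

def window_probabilities():
    """Each plane's 'window' drifts in size and location; model that here as a
    modest probability of being open at a given instant (assumed values)."""
    return [random.uniform(0.1, 0.5) for _ in PLANES]

def trajectory_penetrates(probs):
    """A complete accident trajectory needs every window open at the same moment."""
    return all(random.random() < p for p in probs)

TRIALS = 100_000
hits = sum(trajectory_penetrates(window_probabilities()) for _ in range(TRIALS))
print(f"complete penetrations: {hits} of {TRIALS:,} trials ({hits / TRIALS:.4%})")
# Even with individually quite 'leaky' planes, the joint alignment of all the
# windows at one moment is rare -- the point made by Figure 7.8.
```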

  In a highly defended system, one of the most common accident scenarios involves the deliberate disabling of engineered safety features by operators in pursuit of what, at the time, seems a perfectly sensible goal (i.e., Chernobyl), but that fails to take account either of the side effects of these actions or of system characteristics. On other occasions, the defences are breached because the operators are unaware of concurrently created gaps in system security (i.e., at TMI-2, Bhopal, Herald) because they have an erroneous perception of the system state.

  10.3. Controlling safe operations

  The control of safe operations, like the control of production, is a continuous process. The prerequisites for adequate safety control are: (a) a sensitive multichannel feedback system, and (b) the ability to respond rapidly and effectively to actual or anticipated changes in the safety realm. These two aspects of control—feedback and response—are considered further below.

  Figure 7.9 portrays the feedback loops and indicators potentially available to those responsible for the management of system safety. Together these various feedback loops and indicators constitute the safety information system (SIS).

  It has been shown that an effective safety information system ranks second only to top management involvement in discriminating between safe and unsafe organisations matched on other variables (Kjellen, 1983).

  Loop 1 (the reporting of accidents, lost time injuries, etc.) is the minimum requirement for an SIS. In most cases, however, the information supplied is too little and too late for effective anticipatory control. It is too little because such indices as fatalities and LTIs are simply the tip of the event iceberg. It is too late because they are retrospective; the events that safety management seeks to eliminate have already occurred.

  Figure 7.9. Feedback loops and indicators. The indicators are divided into two groups: failure types (relating to deficiencies in the managerial/organisational sectors) and failure tokens (relating to individual conditions and unsafe acts).

  Loop 2 is potentially—though rarely actually—available through such procedures as unsafe act auditing (Shell Safety Committee, 1987). Usually the information derived from such auditing is disseminated only at the lower levels of the organisation. However, since unsafe acts are the stuff from which accidents are made, a feedback loop that samples the incidence and nature of unsafe acts in various operational spheres would provide a greater opportunity for proactive safety control.
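  The contrast between the two loops can be put in rough numerical terms. The figures in the sketch below (the number of unsafe acts an audit might sample, and the assumed ‘iceberg’ ratio of unsafe acts to lost-time injuries) are illustrative assumptions only, chosen to show the difference in scale and timeliness between the two sources of data.

```python
# Illustrative contrast between a Loop 1 indicator (lost-time injuries, LTIs)
# and a Loop 2 indicator (sampled unsafe acts). Both figures are assumptions.
UNSAFE_ACTS_SAMPLED_PER_MONTH = 400   # what an unsafe act audit might observe
UNSAFE_ACTS_PER_LTI = 2_000           # assumed 'iceberg' ratio of acts to injuries

months = 12
expected_ltis = UNSAFE_ACTS_SAMPLED_PER_MONTH * months / UNSAFE_ACTS_PER_LTI

print(f"Loop 1: roughly {expected_ltis:.1f} LTIs per year -- few data points, "
      "each arriving only after the harm is done")
print(f"Loop 2: {UNSAFE_ACTS_SAMPLED_PER_MONTH} unsafe acts sampled per month -- "
      "a continuous, anticipatory signal")
```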

 
