3.6. General Problem Solver
One of the most influential contributions to our current understanding of the ‘pathologies’ of human problem solving has come from Newell and Simon’s (1972) rule-based computer model, General Problem Solver (GPS). Their data on human problem solving were obtained from verbal protocols: they asked their subjects to write down any tentative solutions they had reached and to ‘think aloud’ as they solved a problem. From these data, they produced a general model of problem solving designed to prove theorems and to solve logical and mathematical problems. Of more concern here, however, are their analytical tools.
Newell and Simon began with the notion of a problem space. This consists of a set of possible states of knowledge about the problem, and a set of operators that are applied to these states to produce new knowledge. A problem is posed by giving an initial state of knowledge and then asking the subject to find a path to a final state of knowledge that includes the answer to the problem. The path to the problem solution is characterized by a problem behaviour graph that tracks the sequence of states of knowledge through which the subject passes on his or her way from the initial to the final state, and by the operators applied to move him or her along this path.
The basic problem-solving strategy of GPS is means-ends analysis. This involves setting a high-level goal, looking for differences between the current state of the world and the goal state, finding a method that would reduce this difference, setting the application of that method as a subgoal, and then recursively applying means-ends analysis until the final state is reached.
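The recursive character of means-ends analysis can be made concrete with a small sketch. The representation below (states as sets of facts, operators with preconditions and effects, and the ‘getting to work’ example) is invented purely for illustration and is not drawn from Newell and Simon’s program:

```python
# A minimal, illustrative sketch of means-ends analysis (not Newell & Simon's
# actual GPS code). States are sets of facts; operators remove differences.

from dataclasses import dataclass

@dataclass(frozen=True)
class Operator:
    name: str
    preconditions: frozenset   # facts that must hold before applying
    adds: frozenset            # facts the operator makes true

def differences(state, goal):
    """Facts required by the goal that the current state lacks."""
    return goal - state

def means_ends(state, goal, operators, depth=10):
    """Return a plan (list of operator names) that reduces state-goal differences."""
    if depth == 0:
        return None
    diffs = differences(state, goal)
    if not diffs:
        return []                                  # goal already satisfied
    for op in operators:
        if op.adds & diffs:                        # this operator reduces some difference
            # Subgoal: satisfy the operator's preconditions first (the recursion).
            pre_plan = means_ends(state, op.preconditions, operators, depth - 1)
            if pre_plan is None:
                continue
            # Toy assumption: the sub-plan achieves exactly the preconditions.
            new_state = state | op.preconditions | op.adds
            rest = means_ends(new_state, goal, operators, depth - 1)
            if rest is not None:
                return pre_plan + [op.name] + rest
    return None

# Hypothetical example: getting to work.
ops = [
    Operator("walk-to-car", frozenset({"at-home"}), frozenset({"at-car"})),
    Operator("drive",       frozenset({"at-car", "have-keys"}), frozenset({"at-work"})),
    Operator("grab-keys",   frozenset({"at-home"}), frozenset({"have-keys"})),
]
print(means_ends(frozenset({"at-home"}), frozenset({"at-work"}), ops))
# -> ['walk-to-car', 'grab-keys', 'drive']
```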
3.7. Rasmussen’s skill-rule-knowledge framework
Like the Norman-Shallice model, Rasmussen’s account of cognitive control mechanisms is error oriented. But there are some important differences in the provenance of these models. Whereas Norman and Shallice were primarily concerned with accounting for the usually inconsequential action slips that occur in the normal course of daily life, Rasmussen’s model is primarily directed at the far more serious errors made by those in supervisory control of industrial installations, particularly during emergencies in hazardous process plants. It was also markedly influenced by GPS and its associated ‘thinking out loud’ methodology.
The skill-rule-knowledge framework originated from a verbal protocol study of technicians engaged in electronic ‘troubleshooting’ (Rasmussen & Jensen, 1974). This tripartite distinction of performance levels has effectively become a market standard within the systems reliability community. The three levels of performance correspond to decreasing levels of familiarity with the environment or task.
3.7.1. Skill-based level
At the skill-based level, human performance is governed by stored patterns of preprogrammed instructions represented as analogue structures in a time-space domain. Errors at this level are related to the intrinsic variability of force, space or time coordination.
3.7.2. Rule-based level
The rule-based level is applicable to tackling familiar problems in which solutions are governed by stored rules (productions) of the type if (state) then (diagnosis) or if (state) then (remedial action). Here, errors are typically associated with the misclassification of situations leading to the application of the wrong rule or with the incorrect recall of procedures.
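The flavour of such productions can be conveyed by a schematic example. The plant states, conditions and remedial actions below are invented for illustration and do not come from Rasmussen’s studies:

```python
# Illustrative rule-based productions of the kind described above:
# if <state> then <diagnosis> / if <state> then <remedial action>.
# The rules and plant states are hypothetical.

rules = [
    # (condition on the observed state, stored response)
    (lambda s: s["pressure"] == "high" and s["relief_valve"] == "closed",
     "diagnosis: relief valve stuck -> action: open bypass valve"),
    (lambda s: s["level"] == "low",
     "diagnosis: feed loss -> action: start standby feed pump"),
]

def rule_based_response(state):
    for condition, response in rules:
        if condition(state):        # the first matching rule fires;
            return response         # misclassifying the state fires the wrong rule
    return None                     # no stored rule: fall back to knowledge-based reasoning

print(rule_based_response({"pressure": "high", "relief_valve": "closed", "level": "normal"}))
```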
3.7.3. Knowledge-based level
The knowledge-based level comes into play in novel situations for which actions must be planned on-line, using conscious analytical processes and stored knowledge. Errors at this level arise from resource limitations (‘bounded rationality’) and incomplete or incorrect knowledge. With increasing expertise, the primary focus of control moves from the knowledge-based towards the skill-based levels; but all three levels can co-exist at any one time.
Rasmussen identified eight stages of decision making (or problem solution): activation, observation, identification, interpretation, evaluation, goal selection, procedure selection and execution. Whereas other decision theorists represent these or comparable stages in a linear fashion, Rasmussen’s major contribution has been to chart the shortcuts that human decision makers take in real-life situations. Instead of a straight-line sequence of stages, Rasmussen’s model is analogous to a step ladder, with the skill-based activation and execution stages at the bases on either side, and the knowledge-based interpretation and evaluation stages at the top. Intermediate on either side are the rule-based stages (observation, identification, goal selection and procedure selection). Shortcuts may be taken between these various stages, usually in the form of highly efficient but situation-specific stereotypical reactions, where the observation of the system state leads automatically to the selection of remedial procedures without the slow and laborious intervention of knowledge-based processing. The ‘step-ladder’ model also allows for associative leaps between any of the decision stages.
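Viewed as a data structure, the step ladder is simply a directed graph in which shortcut edges bypass the knowledge-based apex. The encoding below is an illustrative sketch only; the single shortcut shown (observation leading straight to procedure selection) is the stereotyped reaction just described:

```python
# An illustrative encoding of the step-ladder as a directed graph.
# The full climb runs up one leg and down the other; 'shortcut' edges model
# the stereotyped reactions that bypass knowledge-based processing.

full_path = [
    "activation", "observation", "identification", "interpretation",
    "evaluation", "goal selection", "procedure selection", "execution",
]

edges = {stage: [nxt] for stage, nxt in zip(full_path, full_path[1:])}
edges["execution"] = []

# A situation-specific stereotypical reaction: the observed state maps
# directly onto a stored remedial procedure, skipping the apex.
edges["observation"].append("procedure selection")

def shortest_route(start="activation", goal="execution"):
    """Breadth-first search: the route actually taken favours shortcuts."""
    frontier, seen = [[start]], {start}
    while frontier:
        path = frontier.pop(0)
        if path[-1] == goal:
            return path
        for nxt in edges[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])

print(shortest_route())
# -> ['activation', 'observation', 'procedure selection', 'execution']
```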
3.8. Rouse’s ‘fuzzy rule’ model
Rouse’s problem-solving model (Rouse, 1981; Hunt & Rouse, 1984), like Rasmussen’s (from which it is partly derived), is based upon a recurrent theme in the psychological literature: “humans, if given the choice, would prefer to act as context-specific pattern recognisers rather than attempting to calculate or optimize” (Rouse, 1981). This preference is a product not only of the extraordinary facility with which human memory encodes, stores and subsequently retrieves a virtually limitless set of schematic representations, but also of bounded rationality, ‘satisficing’, and persistence-forecasting.
The model’s indebtedness to GPS is revealed in its assumption that knowledge is stored in a rule-based format (see also J. R. Anderson’s, 1983, ACT* framework for cognition). Following Rasmussen (1981), Rouse distinguished two kinds of problem-solving rules: symptomatic and topographic, each of the form if (situation) then (action). These rules or production systems link two schematic components: a stored pattern of information relating to a given problem situation, and a set of motor programs appropriate for governing the corrective actions. A rule is implemented when its situational component matches either the actual state of the world or some hypothesized representation of it (a mental model).
The distinction between symptomatic and topographic rules stemmed from Rasmussen’s observation of two distinct search strategies adopted by electronic technicians in fault-finding tasks. With a symptomatic strategy, identification of the problem is obtained from a match between local system cues and some stored representation of the pattern of indications of that failure state. This, in turn, is frequently bound together in a rule-based form with some appropriate remedial procedure. In topographic search the diagnosis emerges from a series of good/bad judgements relating to the location or sphere of influence of each of the system components. This mode of search depends upon some plan or mental model of the system, which provides knowledge as to where to look for discrepancies between ideal and actual states.
Thus, these two sets of rules differ in two major respects: in their dependency upon situation-specific as opposed to context-free information and in their reliance upon preexisting rules as distinct from rules derived through knowledge-based processing. Symptom rules (S-rules) are rapid and relatively effortless in their retrieval and application, since they only require a match between local cues and the situational component of the stored rule. An example of an S-rule for diagnosing a car fault might be: If (the engine will not start and the starter motor is turning and the battery is strong) then (check the petrol gauge). Topographic rules (T-rules) need not contain any reference to specific system components, but they demand access to some mental or actual map of the system and a consideration of the structural and functional relationships between its constituent parts. An example might be: If (the output of X is bad and X depends on Y and Z, and Y is known to be working) then (check Z).
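These two example rules can be written out as executable sketches. The observation names, the dependency map and the helper functions below are illustrative assumptions rather than part of Rouse’s model:

```python
# The two example rules from the text, expressed as executable predicates.
# The observation names and the system map are illustrative only.

# S-rule: matches a stored symptom pattern directly against local cues.
def s_rule_petrol(obs):
    if (not obs["engine_starts"]) and obs["starter_turning"] and obs["battery_strong"]:
        return "check the petrol gauge"

# T-rule: refers to no specific component; it needs a map of the system's
# structure and a set of good/bad judgements about its parts.
def t_rule(component, depends_on, known_good, output_bad):
    if output_bad(component):
        suspects = [part for part in depends_on[component] if part not in known_good]
        if len(suspects) == 1:
            return f"check {suspects[0]}"

# Usage with a hypothetical system map:
depends_on = {"X": ["Y", "Z"]}
print(s_rule_petrol({"engine_starts": False, "starter_turning": True, "battery_strong": True}))
print(t_rule("X", depends_on, known_good={"Y"}, output_bad=lambda c: c == "X"))
# -> 'check the petrol gauge' and 'check Z'
```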
The key feature of the model is the assertion that problem solvers first attempt to find and apply S-rules before attempting the more laborious and resource-consuming search for T-rules. Only when the search for an S-rule has failed will the problem solver seek an appropriate T-rule.
An extremely attractive aspect of Rouse’s model is that it can be expressed and modelled mathematically using fuzzy set theory. For a rule to be selected, four criteria must be satisfied in some degree.
(a) The rule must be recallable (available).
(b) The rule must be applicable to the current situation.
(c) The rule must have some expected utility.
(d) The rule must be simple.
Since, from a human problem-solving perspective, these criteria are not always distinguishable by binary-valued attributes, they are best regarded as constituting fuzzy sets. Hence, each rule must be evaluated according to the possibility of its membership in the fuzzy sets of available, applicable, useful and simple rules. Hunt and Rouse (1984) give the equations governing each rule set. The important point to emphasise is that rule selection is heavily influenced by the frequency and recency of its past successful employment (see also Anderson, 1983).
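Hunt and Rouse’s equations are not reproduced here, but the general idea can be sketched as follows: each candidate rule receives a membership value on each of the four criteria, with availability weighted by the frequency and recency of past success, and the rule with the highest combined score is selected. The particular combination rule (a simple product) and all of the numbers below are illustrative assumptions:

```python
# An illustrative fuzzy scoring of candidate rules (not Hunt & Rouse's actual
# equations). Each criterion is a membership value in [0, 1]; here they are
# combined by a simple product, which is only one of several possible choices.

def rule_score(rule):
    # Availability grows with frequency of past success and is weighted by recency.
    availability = min(1.0, 0.1 * rule["past_successes"]) * rule["recency"]
    return (availability
            * rule["applicability"]     # match between situation and rule conditions
            * rule["utility"]           # expected benefit of the associated action
            * rule["simplicity"])       # ease of recalling and executing the rule

candidates = [
    {"name": "S-rule: check petrol gauge", "past_successes": 8, "recency": 0.9,
     "applicability": 0.8, "utility": 0.7, "simplicity": 0.9},
    {"name": "T-rule: trace dependency of X", "past_successes": 2, "recency": 0.4,
     "applicability": 0.9, "utility": 0.8, "simplicity": 0.3},
]

best = max(candidates, key=rule_score)
print(best["name"])   # the frequently and recently successful S-rule wins
```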
Hunt and Rouse (1984) have evaluated the model using a simulated fault-diagnosis task. The model was able to match 50 per cent of the human problem solvers’ actions exactly, while using the same rules approximately 70 per cent of the time. Of particular interest was the finding that T-rules took, on average, more than twice as long as S-rules to execute. It was also found that removing S-rules brought about a marked degradation in the model’s ability to mimic human performance. In addition, it was found necessary to include not only situation-action rules, but also situation-situation rules (i.e., if (situation) then (situation)). The latter produced no overt action, but served to update the model’s knowledge base.
A qualitative variant of this model has also been used by Donnelly and Jackson (1983) to analyse the causes of eight electrical contact accidents suffered by Canadian linesmen and maintenance crews. Four of these involved errors in rule-based performance, and two involved errors in knowledge-based performance. In three accidents, there were failures in identifying hazard cues, and in two of these three cases short-term memory failures were implicated.
The model incorporates at least two error tendencies: ‘place-losing’ and ‘strong schema capture’. Because of the intrinsically recursive nature of the process, problem solvers need a ‘stack’ or working memory to keep track of where they have been within the problem space and where they are going next. Since short-term memory is extremely limited in its capacity, there is the strong likelihood that problem solvers will lose items from the ‘stack’ along the way. This will lead to the omission of necessary steps, the unnecessary repetition of previously executed steps and tangential departures from the desired course.
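Place-losing is easy to demonstrate with a capacity-limited goal ‘stack’; the capacity of three items and the goal names below are purely illustrative:

```python
# Illustrative only: a goal 'stack' with a hard capacity limit. When a new
# subgoal is pushed onto a full stack, the oldest pending item is lost, so
# the corresponding step is later omitted (or an earlier step is repeated).

from collections import deque

class BoundedGoalStack:
    def __init__(self, capacity=3):
        self.items = deque(maxlen=capacity)   # the oldest items silently drop off

    def push(self, goal):
        self.items.append(goal)

    def pop(self):
        return self.items.pop() if self.items else None

stack = BoundedGoalStack(capacity=3)
for goal in ["isolate pump", "close valve A", "vent line", "log the change"]:
    stack.push(goal)

while (goal := stack.pop()) is not None:
    print("do:", goal)
# 'isolate pump' was pushed first and has been lost from the stack,
# so that step is never returned to: a place-losing omission.
```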
The second and more predictable potential for error is the inappropriate acceptance of readily available but irrelevant patterns. As will be considered at length in the next chapter, many factors conspire together to facilitate this erroneous acceptance of ‘strong-but-wrong’ schemata.
3.9. The new connectionism: Parallel distributed processing
All rule-based models assume that human cognition possesses some central processor or working memory through which information must be processed serially. One of the most significant developments of the 1980s has been the appearance of a radically different view of the ‘architecture’ of cognition: one that rejects the need for a central processor and maintains that human memory is organised as a parallel distributed processing system (Hinton & Anderson, 1981; Norman, 1985; McClelland & Rumelhart, 1985; Rumelhart & McClelland, 1986).
Parallel distributed processing (PDP) models are neurologically inspired and posit the existence of a very large number of processing units that are organised into modules. These elements are massively interconnected and interact primarily through the activation and inhibition of one another’s activity. Although the processing speed of each unit is relatively slow, the resulting computations are far faster than those of the fastest of today’s computers. As Norman (1985) put it: “Parallel computation means that a sequence that requires millions of cycles in a conventional, serial computer can be done in a few cycles when the mechanism has hundreds of thousands of highly interconnected processors.”
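The kind of unit-level computation involved can be conveyed by a minimal sketch in which positive connection weights excite, negative weights inhibit, and every unit updates in parallel from the activity of its neighbours. The three-unit network, its weights and the squashing function are invented for illustration and are not taken from any of the cited models:

```python
# A minimal sketch of interacting units: positive weights excite, negative
# weights inhibit, and all units update in parallel from their neighbours.
# The three-unit network and its weights are hypothetical.

import math

weights = [             # weights[i][j]: influence of unit j on unit i
    [0.0,  0.6, -0.4],
    [0.6,  0.0, -0.4],
    [-0.4, -0.4, 0.0],
]
activation = [0.9, 0.1, 0.5]    # initial activations (external input)

def step(act):
    new = []
    for row in weights:
        net = sum(w * a for w, a in zip(row, act))      # summed excitation/inhibition
        new.append(1 / (1 + math.exp(-4 * net)))        # squashed into (0, 1)
    return new

for _ in range(20):                                     # parallel updates settle
    activation = step(activation)
print([round(a, 2) for a in activation])
# The mutually excitatory units 0 and 1 settle high; the inhibited unit 2 settles low.
```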
One aim of recent PDP models (McClelland & Rumelhart, 1986) has been to resolve a central dilemma in theories of human memory: How does memory represent both the generalised (or prototypical) aspects of classes of objects or events and the specific features of individual exemplars? Norman (1985) has given an eloquent account of how the PDP modellers tackle this problem.
The PDP approach offers a very distinctive counter-approach [to rule-based or concept-based models]. Basically, here we have an adaptive system, continually trying to configure itself so as to match the arriving data. It works automatically, pre-wired, if you will, to adjust its own parameters so as to accommodate the input presented to it. It is a system that is flexible, yet rigid. That is, although it is always trying to mirror the arriving data, it does so by means of existing knowledge, existing configurations. It never expects to make a perfect match, but instead simply tries to get the best match possible at any time: The better the match, the more stable the system. The system works by storing particular events on top of one another: Aspects of different events co-mingle. The result is that the system automatically acts as if it has formed generalizations of particular instances, even though the system only stores individual instances. Although the system develops neither rules of classification nor generalizations, it acts as if it had these rules. It is a system that exhibits intelligence and logic, yet nowhere has explicit rules of intelligence or logic.
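The superposition idea in this passage can be illustrated with a toy demonstration: several noisy instances are stored simply by adding their feature vectors into a single composite trace, yet a probe corresponding to the never-presented prototype typically matches that composite at least as well as any single stored instance. The feature vectors and noise level below are invented for illustration:

```python
# A toy demonstration of storage by superposition: noisy instances are summed
# into one composite trace, yet the never-presented prototype is matched well.
# Feature vectors and noise level are illustrative.

import random
random.seed(1)

prototype = [1, 1, -1, 1, -1, -1, 1, -1]          # the 'ideal' exemplar, never stored

def noisy_instance(p, flip_prob=0.2):
    return [(-x if random.random() < flip_prob else x) for x in p]

instances = [noisy_instance(prototype) for _ in range(10)]
composite = [sum(col) for col in zip(*instances)]  # instances stored on top of one another

def match(probe, trace):
    return sum(p * t for p, t in zip(probe, trace))

print("prototype match:", match(prototype, composite))
print("best single-instance match:", max(match(inst, composite) for inst in instances))
# The prototype typically matches the composite at least as well as any stored
# instance, even though only individual instances were ever stored.
```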
Although, as PDP theorists (McClelland & Rumelhart, 1986) concede, this ‘headless’ view of memory processing (at least in its present form) fails to explain the coherence and ‘planfulness’ of human cognition, it is extremely good at accounting for the ‘graceful degradation’ of overstretched human performance, and for the appearance of recurrent error forms. In particular, it can provide credible explanations for the similarity, frequency and confirmation biases considered in detail in later chapters (see also Norman, 1986). As will be seen, the ‘spirit’ of PDP theorizing, if not the precise ‘letter’, is a central feature of the computational model of human error production described in Chapter 5.
3.10. Baars’s global workspace (GW) model
The global workspace (GW) model (most fully described in Baars, 1988) is concerned with explaining the many pairs of apparently similar phenomena that seem to differ only in that one is conscious and the other is not (minimally contrastive pairs). Although the model takes a predominantly ‘information-processing’ stance, with special attention given to cognitive failures, it also embraces both neurophysiological and clinical evidence.
Human cognition is viewed as a parallel distributed processing system (see Hinton & Anderson, 1981; Rumelhart & McClelland, 1986) made up of a large assembly of specialised processors (or specialists) covering all aspects of mental function. As in the Norman-Shallice model, these processors do not require any central mechanism to exercise executive control over them; each may decide by its own local criteria what is worth processing. But they do need a ‘central information exchange’ in order to coordinate their activity with regard to various nested goal structures (plans).
This role is fulfilled by the global workspace, a ‘working memory’ that allows specialist processors to interact with each other. Specialists compete for access to the global workspace on the basis of their current activation level. Once there, they can ‘broadcast’ a message throughout the system to inform, recruit or control other processors. The stable components of a global representation are described as a context, so called because they prompt other processors to organize themselves according to these local constraints. Consciousness reflects the current contents of the global workspace. As in other models, GW is closely identified with short-term memory and the limited-capacity components of the cognitive system.
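The cycle just described, competition for access followed by a global broadcast, can be sketched schematically; the specialist processors, their activation values and the message format below are invented for illustration:

```python
# A schematic sketch of a global-workspace cycle: specialist processors compete
# on activation, the winner broadcasts to all the others, and the broadcast
# corresponds to the current 'conscious' content. The specialists and their
# activation values are hypothetical.

from dataclasses import dataclass, field

@dataclass
class Specialist:
    name: str
    activation: float
    inbox: list = field(default_factory=list)

    def receive(self, message):
        self.inbox.append(message)      # other specialists may be informed or recruited

specialists = [
    Specialist("smoke-detector pattern", activation=0.9),
    Specialist("routine monitoring", activation=0.4),
    Specialist("coffee-break planning", activation=0.2),
]

def global_workspace_cycle(specialists):
    winner = max(specialists, key=lambda s: s.activation)   # competition for access
    message = f"{winner.name}: attend to this"
    for s in specialists:                                    # global broadcast
        if s is not winner:
            s.receive(message)
    return message                                           # the current conscious content

print(global_workspace_cycle(specialists))
```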
4. Current trends in cognitive theorising
A little over 60 years ago, Charles Spearman felt that he could not achieve a serviceable definition of intelligence until he had first established “the framework of the entire psychology of cognition” (Spearman, 1923, p. 23). Some 40 years later, when cognition reemerged as a major field of psychological enquiry, its leading exponents turned their backs on such large-scale ambitions. Instead, they favoured an abundance of small, data-bound theories yielding specific predictions tailored to the controlled manipulation of well-defined laboratory phenomena. To get ahead in the largely academic cognitive psychology of the 1960s and 1970s, it was necessary to conduct experiments that tested between two or more (but usually two) currently fashionable theories within some well-established paradigm. The product of this pursuit of ‘binarism’ has been “a highly elaborated base of quantitative data over many diverse phenomena, with many local theories” (Card, Moran & Newell, 1983).