2.2. Divided attention and resource theories
In the selective attention tasks discussed above, subjects were required to focus their attention upon one of the available sources of information and to exclude the others. In divided attention studies, they are expected to distribute their attention between the assigned activities and to deal with each of them to the best of their ability. This type of experimental paradigm was mainly responsible for the development of resource theories of attention.
Resource theory is the product of several independent strands of research (Knowles, 1963; Moray, 1969; Kahneman, 1973; Norman & Bobrow, 1975; Navon & Gopher, 1979). In its basic form, this theory—or, more accurately, set of theories—assumes that attention can be usefully regarded as a single reservoir of information-processing resources that is equally available to all mental operations. More sophisticated variants invoke a multiplicity of non-overlapping reservoirs (see Wickens, 1983).
This view of attention as a finite but highly flexible control resource removes the need to postulate some specialised filterlike mechanism to account for selection. In resource theory, selection is implicit in the restricted nature of the attentional commodity; it can only be deployed in relation to a limited set of entities, though the possible claims upon it are legion. Thus, only those events, task elements or ideas that receive some critical allocation of this resource will achieve deeper levels of processing.
As stated earlier, a large part of the experimental literature underpinning resource theory has been concerned with people’s ability to divide attention between two concurrent tasks. It is possible to envisage a continuum of dual-task situations, running from one extreme at which both tasks are wholly automatic, and thus make little or no claim upon attention, to an opposite extreme where the demands imposed by each task are so high that they can hardly be performed adequately, even in isolation. Most studies have been concerned with the middle ground. Within this region, the crucial factor is task similarity. The greater the similarity between the components of the two tasks, the more likely they are to call upon the same processing resources at the same time, and thus produce mutual interference.
However, the more practised people are at handling the two activities at the same time, the more proficient they become at responding to just those features that differentiate the two tasks, thus reducing their initial similarity and, with it, the likelihood of interference. As Kinsbourne (1981, p. 66) stated: “Over time, processing will descend the hierarchy from limited problem solving towards a more automatic, less attention requiring mode. Thus it is in the nature of the attentive processes to generate their own extinction.”
The results of several studies (see Wickens, 1980) indicate that interference can occur at several different stages of the information-handling sequence rather than at a single critical point. As Broadbent (1982) put it: “The main interference between two tasks occurs at the point where they compete most for the same functions.”
Dual-task interference can show itself in a variety of ways: as a complete breakdown of one of the activities, as a lowering of performance and/or a slowing in the rate of responding of one or both activities, and as ‘cross-talk’ errors in which elements of one task either bias responses in the other or actually migrate from one activity to the other (Long, 1975; Kinsbourne & Hicks, 1978).
2.3. Multichannel processor theories
A number of studies have demonstrated that highly skilled individuals performing essentially different concurrent tasks show minimal interference (Allport, Antonis & Reynolds, 1972; Shaffer, 1976; Spelke, Hirst & Neisser, 1976; Hirst, Spelke, Reaves, Caharack & Neisser, 1980). These results have given rise to a view of human information processing known as the multichannel processor theory. Alan Allport has been one of the most active proponents of this viewpoint. He presented an early version of it as follows:
In general, we suggest, any complex task will depend on the operation of a number of independent, specialised processors, many of which may be common to other tasks. To the extent to which the same processors are involved in any two particular tasks, truly simultaneous performance of these two tasks will be impossible. On the other hand, the same tasks paired respectively with another task requiring none of the same basic processors can in principle be performed in parallel with the latter without mutual interference. (Allport, Antonis & Reynolds, 1972)
More recently, Allport (1980a, 1980b) has criticised cognitive theories (especially bottleneck and resource theories) that assume the existence of a general-purpose, limited-capacity central processor, or GPLCCP for short. The largely tacit belief underlying the research described earlier, namely that there are ‘hardware’ constraints upon the amount of information that can be processed by the cognitive system at any given time, is not, Allport maintained, the only conclusion that can be drawn from the work on divided attention. What interpretation is placed upon these findings depends largely upon the assumption held regarding cognitive ‘architecture’. The greater part of the research on cognitive limitations has assumed a hierarchical, multilevel architecture in which the topmost level (the GPLCCP) needs processing constraints in order to focus selectively upon particular kinds of information.
Allport’s claim was that this view of the cognitive apparatus is limited by old-fashioned notions of computer architecture. Cognitive psychologists were quick to employ a range of computer metaphors for human information processing in the 1950s and 1960s; but, Allport (1980a, p. 27) argues, these ideas were “derived from the basic design of the sequential, general-purpose digital computer itself, rather than from the potential computational processes that might be implemented on it.”
Cognitive psychology, Allport demanded, should eschew models based on general-purpose digital computers and abandon its assumptions about content-independent processors. Also consigned to the dustbin should be any notion of quantitative limitations upon the information to be processed. In their place should be constructed models based upon distributed, heterarchical architectures in which resides a community of specialised processors, or ‘experts’, who are capable of managing their affairs without any overall ‘boss’, GPLCCP or ‘central executive’. As we shall see later, such models have indeed flourished in the mid to late 1980s.
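Allport’s alternative can be caricatured in a few lines of code. The sketch below shows a community of specialised processors, each responding only to inputs within its own competence, with no central executive routing the traffic; the processor names and the dispatch scheme are assumptions made purely for illustration, not anything drawn from Allport’s own proposals.

```python
# Illustrative sketch of a 'heterarchical' community of specialised processors.
# The processor names and the dispatch scheme are illustrative assumptions.

specialists = {
    "speech": lambda content: f"speech processor handled: {content}",
    "visual": lambda content: f"visual processor handled: {content}",
    "motor":  lambda content: f"motor processor handled: {content}",
}

def broadcast(event_kind, content):
    """Every specialist inspects the event; only the competent one responds."""
    return [handle(content)
            for kind, handle in specialists.items()
            if kind == event_kind]

print(broadcast("speech", "a spoken word"))
print(broadcast("visual", "a printed word"))
# On this view, interference arises only when two concurrent activities compete
# for the SAME specialist, not because a single general-purpose channel is full.
```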
2.4. The properties of primary memory
Most of the scientific literature relating to memory is derived from laboratory studies in which people are tested on their memory for some recently learned list of items, usually comprising nonsense syllables (Ebbinghaus, 1885), digit strings or individual words. The appeal of such studies is that they permit a large measure of control over the to-be-remembered material as well as over the conditions of learning and recall. By manipulating these variables systematically and observing their effects upon memory performance, the investigator is able to make reasonably confident statements about the causes of the observed effects. The penalty paid for this degree of control is that the results of these studies tell us relatively little about the way people actually use their memories. Nevertheless, they have taught us a great deal about the properties of the ‘sharp end’ of the memory system: primary or short-term memory (STM).
Since Wundt’s observations in 1905, it has been known that most people have an immediate memory span of around seven unrelated items. More recent studies have shown that this span can be altered in various ways.
Span is increased when the to-be-remembered items are grouped in a meaningful way. Miller (1956) called this chunking. Thus, the span for letters increases to around ten when they are presented as consonant-vowel-consonant nonsense syllables (e.g., TOK, DEX, DAS, etc.), and rises to between 20 and 30 when letters form part of sentences. Miller argued that the capacity for immediate memory was seven plus or minus two chunks, regardless of the number of individual items per chunk. In other words, STM can be thought of as comprising a limited number of expansible ‘bins’, each containing a variably-sized chunk of information.
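The arithmetic behind this claim can be made explicit with a brief sketch; only the seven-chunk figure is taken from the text, while the function name and the example materials are hypothetical.

```python
# A sketch of Miller's chunking arithmetic: immediate memory holds a roughly
# fixed number of CHUNKS, so the number of raw letters retained grows with the
# size of each chunk. Only the seven-chunk figure comes from the text.

SPAN_IN_CHUNKS = 7  # Miller's "seven plus or minus two"

def letters_retained(chunks):
    """Count the raw letters held when only SPAN_IN_CHUNKS chunks fit in span."""
    return sum(len(chunk) for chunk in chunks[:SPAN_IN_CHUNKS])

unrelated_letters = list("JQXRZLBWMK")                         # chunks of size 1
sentence_words = "SOME DOGS BARK WHEN THEY MEET CATS".split()  # chunks of size 4

print(letters_retained(unrelated_letters))  # about 7 letters
print(letters_retained(sentence_words))     # about 28 letters, i.e. the 20-30 range
```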
Conversely, span may be dramatically reduced by the acoustic similarity effect (Conrad, 1964). The span for sound-alike items is markedly less than that for phonologically dissimilar items, even when the material is presented visually. In one study (Baddeley, 1966), only 9.6 per cent of acoustically similar sentences were correctly recalled, compared to 82.1 per cent for acoustically dissimilar sentences.
There is also a clear relationship between memory span and word length under normal conditions of immediate recall (Baddeley, Thomson & Buchanan, 1975). When subjects were required to recall sequences of five words of differing length, there was 90 per cent correct recall for one-syllable words but only 50 per cent for five-syllable words. In addition, there was a close correspondence between these recall scores and reading rates. This suggested that the word-length effect reflects the speed at which the subjects could rehearse the words subvocally.
Both the acoustic similarity and the word-length effects suggest that STM relies heavily on acoustic or phonological coding. In addition, it has long been known that the latter part of a sequence presented auditorily is better recalled than one presented visually (Von Sybel, 1909). This enhanced recency effect for vocalised items can be eliminated by the addition of an irrelevant item to the end of the sequence. Thus, people are better able to recall the sequence 8-5-9-6-3-1-7 than 8-5-9-6-3-1-7-0, where the terminal zero does not have to be remembered, but merely acts as a recall instruction (Conrad, 1960; Crowder & Morton, 1969). To exert this influence, however, the ‘suffix’ has to be speechlike. A buzzer sounded at the same point has no effect, nor does the semantic character of the suffix appear to matter. However, this suffix effect can be greatly reduced by introducing a brief pause between the end of the to-be-remembered items and the suffix.
2.5. The concept of working memory
One view of STM that has gained wide acceptance over the past decade is that of working memory (Baddeley & Hitch, 1974; Baddeley, 1976; Hitch, 1980). Unlike earlier accounts, which treated STM simply as a temporary store intervening between the perceptual processes and long-term memory, the working memory (WM) concept is both more broadly defined and more differentiated than its predecessors (see Atkinson & Shiffrin, 1968).
In its most recent form, WM is conveniently divided into three components: a central executive, which acts as a limited-capacity control resource and is closely identified with both attention and consciousness, and two ‘slave systems’, the articulatory loop and the visuospatial scratchpad.
Although the articulatory loop and the visuospatial scratchpad store different kinds of information, both subsystems are assumed to have an essentially similar structure. Each comprises two elements: a passive store and an active rehearsal process. While both are under the control of the central executive, they are nevertheless capable of some degree of independent function, that is, without conscious attention being given to their operations. Thus, the articulatory loop is able to store a small quantity of speechlike material—about three items in the appropriate serial order or the amount of material that can be repeated subvocally in about one and a half seconds—without involving the central executive (Baddeley & Hitch, 1974).
In the articulatory loop, the passive store is phonological and is accessed directly by any auditory speech input. This is an obligatory process; speech inputs will always displace the current contents of the passive store. The passive store can also be accessed optionally via the active-rehearsal element. The acoustic similarity effect, noted earlier, can be explained on the basis of the confusion within the passive store of phonologically-similar items. Memory span will be a function of both the durability of item traces within the passive store and the rate at which the contents can be refreshed by subvocal rehearsal. Since short words can be said ‘under the breath’ more quickly than longer words, more of them can be rehearsed before their traces fade from the passive store. This provides a satisfactory account of the word-length effect.
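The trace-decay account lends itself to a back-of-the-envelope calculation. In the sketch below, the 1.5-second figure is the one cited above for the loop’s capacity, while the articulation rate is an assumed parameter chosen purely for illustration.

```python
# A toy calculation of the articulatory-loop account of the word-length effect.
# TRACE_LIFETIME_S echoes the roughly 1.5 seconds of subvocal speech cited
# above; the articulation rate is an assumed, illustrative parameter.

TRACE_LIFETIME_S = 1.5     # approximate durability of an unrehearsed trace
ARTICULATION_RATE = 4.0    # hypothetical syllables rehearsed per second

def words_maintainable(syllables_per_word):
    """How many words can be refreshed before the first trace fades?"""
    time_per_word = syllables_per_word / ARTICULATION_RATE
    return int(TRACE_LIFETIME_S / time_per_word)

for syllables in (1, 2, 5):
    print(f"{syllables}-syllable words: about {words_maintainable(syllables)} can be maintained")
# Short words cycle through rehearsal faster, so more of them are refreshed
# before their traces fade: the word-length effect falls out of the timing.
```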
While there are still many unresolved questions concerning the precise nature of WM, it has certain features that make it congenial to the cognitive theorist. Not only does it account reasonably well for most of the basic findings of earlier STM research, it also links, albeit sketchily, the functions of the slave systems to that of the central executive (which still survives in many theories despite Allport’s attack). In other words, WM plays a crucial part in controlling the current ‘working database’, regardless of whether the information contained therein has arrived by way of the senses or whether it has been ‘called up’ from long-term memory in the course of reasoning, thinking or performing mental arithmetic. In this way, it also breaks free of the dead hand of earlier ‘pipe-line’ models of human information processing (see Broadbent, 1984) in which the direction of the data flow is largely one way, from the sensory input through to motor output. Central to the WM concept is the idea of a mental workspace that shares its limited resources between processing and storage.
There is now considerable evidence to show that the primary constraints upon such cognitive activities as arithmetical calculation (Hitch, 1978), concept formation (Bruner, Goodnow & Austin, 1956), reasoning (Johnson-Laird, 1983; Evans, 1983) and diagnostic ‘troubleshooting’ by electronic technicians (Rasmussen, 1982) stem from WM limitations. The WM concept holds that there is competition between the resources needed to preserve temporary data generated during processing (e.g., the intermediate stage of a mental arithmetic calculation) and those consumed by the processing itself. Many of the characteristic strategies and shortcuts employed by people during the course of protracted and demanding cognitive activities can be viewed as devices for easing the burden upon working memory; or, in other terms, as ways of minimising cognitive strain (Bruner et al., 1956).
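The point about intermediate results can be illustrated with a small sketch of digit-by-digit mental addition; the bookkeeping scheme below is an assumption made for illustration, not a model drawn from any of the studies cited.

```python
# A sketch of the storage/processing trade-off in mental arithmetic: adding two
# numbers digit by digit forces intermediate digits and carries to be held
# while further processing continues. The bookkeeping scheme is illustrative.

def peak_intermediate_load(a, b):
    """Peak count of partial-result digits and carries held during addition."""
    digits_a, digits_b = str(a)[::-1], str(b)[::-1]
    carry, held, peak = 0, [], 0
    for i in range(max(len(digits_a), len(digits_b))):
        da = int(digits_a[i]) if i < len(digits_a) else 0
        db = int(digits_b[i]) if i < len(digits_b) else 0
        total = da + db + carry
        carry = total // 10
        held.append(total % 10)                        # partial result to store
        peak = max(peak, len(held) + (1 if carry else 0))
    return peak

print(peak_intermediate_load(7, 5))        # 2: one stored digit plus a carry
print(peak_intermediate_load(4768, 9857))  # 5: the storage load grows with the sum
```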
3. The cognitive science tradition
3.1. Contemporary schema theorists
After nearly 30 years of lying dormant, Bartlett’s schema concept reemerged in three disparate publications in the same year: Minsky (1975) writing on computer vision, Rumelhart (1975) on the interpretation of stories, and Schmidt (1975) in the context of motor skill learning. So appeared the new schema theorists.
Although they vary widely in their terminology and applications, schema theories, both ancient and modern, reject an atomistic view of mental processes, and maintain “that there are some phenomena that cannot be accounted for by a concatenation of smaller theoretical constructs, and that it is necessary to develop larger theoretical entities to deal with these phenomena” (Brewer & Nakamura, 1983). One way of catching the spirit of this revival is by looking briefly at the work of two of the most influential theorists: Minsky and Rumelhart. Minsky was primarily concerned with perception and with the way schemata guide the encoding and storage of information. Rumelhart’s interest was in text comprehension and memory for stories. Both advanced essentially similar ideas.
Common to both theories is the idea that schemata are high-level knowledge structures that contain informational ‘slots’ or variables. Each slot will only accept a particular kind of information. If the current inputs from the world fail to supply specific data to fill these slots, they take on ‘default assignments’: stereotypical values derived from past transactions with the world. As will be seen in later chapters, this idea of reverting to ‘default assignments’ is central to the main thesis of this book.
Minsky was concerned with the computer modelling of pattern recognition. He argued that adequate recognition of three-dimensional scenes was impossible on the basis of momentary input patterns alone. He proposed that the computer, like human cognition, must be ready for each scene with preconceived knowledge structures that anticipate much of what would appear. It was only on this basis, he argued, that human perception could operate with such adaptive versatility.
In Minsky’s terminology, commonly encountered visual environments, such as rooms, are represented internally by a frame, containing nodes for standard features such as walls, floors, ceilings, windows and the like, and slots for storing the particular items relating to a certain kind of room. Thus, if people are shown the interior of a room very briefly, their subsequent description of its layout and contents probably will be biased more towards a prototypical room than the sensory evidence would warrant. For example, if their glance took in the presence of a clock upon the wall, it is probable that, if pressed, they would report that the clock had hands, even though, on this particular occasion, none were present.
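The frame idea, and the reversion to default assignments, can be expressed as a small data structure; the slot names and default values below are purely illustrative.

```python
# A sketch of a Minsky-style room frame: each slot takes its value from the
# current input where one is available and falls back on a stereotyped default
# assignment otherwise. Slot names and defaults are illustrative assumptions.

ROOM_FRAME_DEFAULTS = {
    "walls": "four, papered",
    "window": "one, curtained",
    "clock_face": "round, with hands",
}

def instantiate(frame_defaults, observed):
    """Fill each slot from observation if possible, else from its default."""
    return {slot: observed.get(slot, default)
            for slot, default in frame_defaults.items()}

# A brief glance registers the window but not the detail of the clock's face.
glimpse = {"window": "two, uncurtained"}
report = instantiate(ROOM_FRAME_DEFAULTS, glimpse)
print(report["clock_face"])  # "round, with hands": the default fills the gap,
                             # even if this particular clock had no hands.
```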
The very rapid handling of information characteristic of human cognition is possible because the regularities of the world, as well as our routine dealings with them, have been represented internally as schemata. The price we pay for this largely automatic processing of information is that perceptions, memories, thoughts and actions have a tendency to err in the direction of the familiar and the expected.
Rumelhart defined schemata as “data structures for representing generic concepts stored in memory” (Rumelhart & Ortony, 1977, p. 101). Like Minsky, he asserted that schemata have variables with constraints and that these variables (or slots) have a limited distribution of possible default values. In addition, he stressed the embedded nature of schemata. High-level schemata will have lower-level schemata as subparts, the whole nesting together like Russian dolls, one within another. Thus, an office building schema will have office schemata as sub-parts. Similarly, an office schema is likely to include desks, filing cabinets, typewriters and so on as subcomponents.
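The embedding Rumelhart describes is naturally expressed as schemata nested within schemata; the particular structure below simply restates the office-building example in data-structure form, with slots and values chosen only for illustration.

```python
# A sketch of embedded schemata: a high-level schema contains lower-level
# schemata as subparts, which in turn contain their own subcomponents.
# The specific slots and values are illustrative, echoing the example above.

office_schema = {
    "furniture": ["desk", "filing cabinet", "typewriter"],
    "occupant": None,                   # an unfilled slot awaiting assignment
}

office_building_schema = {
    "floors": 3,                        # a default value for an unobserved slot
    "offices": [office_schema, office_schema],  # lower-level schemata as subparts
}

# Traversing the structure moves through schema within schema, like Russian dolls.
for office in office_building_schema["offices"]:
    print(office["furniture"])
```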
Rumelhart also attempted to clarify the nature of the interactions between incoming episodic information and the generic information embodied in the schemata: the relationships between new and old knowledge. Rumelhart and Ortony (1977) argued that “once an assignment has been made, either from the environment, from memory, or by default, the schema is said to have been instantiated.” Only instantiated schemata get stored in memory. During the process of recall, generic information may be used to further interpret and reconstruct a particular memory from the original instantiated schema record.