by Morton Hunt
A number of other psychologists, though not saying they were disillusioned, sought to broaden the IP view to include the mind’s use of schemas, shortcuts, and intuitions, and its ability to function on both the conscious and unconscious levels at once, conducting several processes in parallel (a critical issue we shall hear more of in a moment).
Still others challenged the notion that computers programmed to think like humans actually think. AI, they maintained, isn’t anything like human intelligence, and though it may vastly outperform the human mind at calculations, it would never do easily, or at all, many things the human mind does routinely and effortlessly.
The most important difference is the computer’s inability to understand what it is thinking about. John Searle and Hubert Dreyfus, both philosophy professors at Berkeley, the computer scientist Joseph Weizenbaum at MIT, and others argued that computers, even when programmed to reason, merely manipulate symbols without having any idea what they mean and imply. General Problem Solver, for instance, may have figured out how the father and two sons could get across the river, but only in terms of algebraic symbols; it did not know what a father, son, or boat were, what “sink” meant, what would happen if they sank, or anything else about the real world.
But many programs written in the 1970s and 1980s did seem to deal with real-world phenomena. This was especially true of “expert systems,” computer programs written to simulate the reasoning, and make use of the special knowledge, of experts in fields ranging from oncology to investment and from locating veins of ore to potato farming.
Typically, such programs, designed to aid problem solving, ask the person operating them questions in English, use the answers and their own stored knowledge to move through a decision-tree pattern of reasoning, close off dead ends, narrow down the search, and finally reach a conclusion to which they assign a certainty ratio (“Diagnosis: systemic lupus erythematosus, certainty .8”). By the mid-1980s, scores of such programs were in routine use in scientific laboratories, government, and industry, and before the end of the decade many hundreds were.97
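The question-and-answer narrowing just described can be sketched in a few lines of Python. This is only a toy illustration, not the design of any actual expert system: the tree, the questions, the conclusions, and the certainty values are all invented, and the names `Question`, `Conclusion`, and `consult` are hypothetical.

```python
# Toy decision-tree consultation. Every question, conclusion, and certainty
# value below is invented for illustration; none comes from a real system.
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Conclusion:
    text: str
    certainty: float  # between 0.0 and 1.0, in the style of "certainty .8"


@dataclass
class Question:
    prompt: str
    yes: Union["Question", Conclusion]  # branch followed on a "yes" answer
    no: Union["Question", Conclusion]   # branch followed on a "no" answer


# Stored knowledge: a hypothetical two-question tree about a potato crop.
TREE = Question(
    "Are the leaves yellowing?",
    yes=Question(
        "Has the soil been waterlogged?",
        yes=Conclusion("Likely cause: overwatering", 0.8),
        no=Conclusion("Likely cause: nutrient shortage", 0.6),
    ),
    no=Conclusion("No problem identified", 0.5),
)


def consult(node, ask: Callable[[str], bool]) -> Conclusion:
    """Walk the tree; each answer closes off one branch and narrows the search."""
    while isinstance(node, Question):
        node = node.yes if ask(node.prompt) else node.no
    return node


if __name__ == "__main__":
    # Scripted answers stand in for the person operating the program.
    scripted = {
        "Are the leaves yellowing?": True,
        "Has the soil been waterlogged?": True,
    }
    result = consult(TREE, lambda prompt: scripted[prompt])
    print(f"{result.text}, certainty {result.certainty}")
```

The program follows its branches without any notion of what a leaf or soil is, which is the point the critics quoted earlier were making.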
Probably the oldest and best-known expert system is MYCIN, created in 1976 and improved in 1984, which can be used to detect and identify (and potentially even treat) about a hundred different kinds of bacterial infections, and announce what degree of certainty it puts on its findings. In tests against human experts, “MYCIN’s performance compared favorably with that of faculty members in the Stanford School of Medicine… [and] outperformed medical students and residents in the same school,” notes the distinguished cognitivist Robert J. Sternberg in Cognitive Psychology (2006), “[and]… had been shown to be quite effective in prescribing medication for meningitis.” INTERNIST, another expert system, diagnoses a broader range of diseases, although in doing so it loses some precision, leaving its diagnostic powers below those of an experienced internist.
But although these and other expert systems are intelligent in a way that banking computers, airline reservation computers, and others are not, in reality they do not know the meaning of the real-world information they deal with, not in the sense that we know. CADUCEUS, an internal medicine consultation system, can diagnose five hundred diseases nearly as well as highly qualified clinicians, but an authoritative textbook, Building Expert Systems, long ago pointed out that it “has no understanding of the basic pathophysiological processes involved” and cannot think about medical problems outside or at the periphery of its area of expertise, even when plain common sense is all that is needed.98 One medical diagnostic program failed to object when a human user asked whether amniocentesis might be useful; the patient was male and the system simply wasn’t “aware” that the question was absurd. As John Anderson has said, “The major difficulty which human experts handle well is that of understanding the context in which knowledge is to be used. A logical engine will only yield appropriate results if that context has been carefully defined.”99 But to define contexts as broadly and richly as the human mind does would require an unimaginable amount of data and programming.
The most impressive demonstrations of computer reasoning have been the chess matches in which AI programs have defeated human chess champions. In 1997 a program called Deep Blue defeated the world’s best chess player, Garry Kasparov. In large part, it did so by brute force, searching about 200 million possible moves each second (a human being does well to manage one move per second). Since then, other programs, using far less hardware and running far more slowly but relying more on strategy (particularly the kinds of creative and original strategies that chess masters can devise), have defeated most of their top-level human opponents. Some of the newer programs make counterintuitive, even ridiculous-looking, moves that can prove to be highly creative.100
Among other arguments against the assertion that AI programs think, made by many psychologists and other scientists, are these:101
—AI programs of the expert system type or with broader reasoning abilities lack intuition, a crucial characteristic of human intelligence. Although computers can be excellent manipulators of symbols and can carry out complex prepackaged algorithms, they cannot act on the kinds of hunches that genuine experts can but people with only book knowledge cannot.
—AI programs have no sense of self or of their place in the world around them. This severely limits their ability to do much real-world thinking.
—They are not conscious. Even though consciousness is still proving extremely difficult to define, we experience it and they do not. They cannot, therefore, examine their own thoughts and change their minds as a result. They make choices, but these are determined by their built-in data and their programming. Computers thus have nothing that resembles free will (or, if you prefer, free choice).
—They cannot—at least not yet—think creatively except within the purely abstract realm of chess. Some programs do generate new solutions to technical problems, but these are recombinations of existing data. Others have written poetry and music and created paintings, but their products have made little dent in artistic worlds; as in Doctor Johnson’s classic remark, they are “like a dog’s walking on his hinder legs. It is not done well; but you are surprised to find it done at all.”
—Finally, they have no emotions or bodily sensations, although in human beings these profoundly influence, guide, and not infrequently misguide, thinking and deciding.
Nonetheless, both the IP metaphor and the computer have been of immense value in the investigation of human reasoning. The IP model has spawned a profusion of experiments, discoveries, and insights about those cognitive processes which take place in serial fashion. And the computer, on which IP theories can be modeled and either validated or invalidated, has been an invaluable laboratory tool.
But the shortcomings of the IP model and the limitations of AI simulations led, by the 1980s, to a second stage of the cognitive revolution: the emergence of a radically revised IP paradigm. Its central concept is that while the serial model of information processing fits some aspects of cognition, most—especially the more complex mental processes—are the result of a very different model, parallel processing.
By astonishing coincidence—or perhaps through a cross-fertilization of ideas—this accorded with then-new findings of brain research showing that in mental activities, nerve impulses do not travel a single route from one neuron to another; they proceed by the simultaneous activation of multitudes of intercommunicating circuits. The brain is not a serial processor but a massively parallel processor.
Matching these developments, computer scientists got busy devising a new kind of computer architecture in which interlocking and intercommunicating processors work in parallel, affecting one another’s operations in immensely complex ways that are more nearly analogous to those of the brain and mind than are serial computers.102 The new computer architecture is not patterned on the neuron networks of the brain, most of which are still unmapped and too complex by an astronomical degree to be copied, but it does, in its own way, perform parallel processing.
The technical details of these three developments lie beyond the scope of this book. But their meaning and significance do not; let us see what we can make of them.
New Model
In 1908, Henri Poincaré, a French mathematician, labored for fifteen days to develop a theory of Fuchsian functions but without success. He then left to go on a geological expedition. Just as he boarded a bus, talking to a fellow traveler, the solution popped into his mind so clearly and unequivocally that he did not even interrupt his conversation to check it out. When he did so later, it proved correct.
The annals of creativity are full of such stories; they suggest that two (or possibly more) thoughts can be pursued simultaneously by the mind, one consciously, the other or others unconsciously. Anecdotes are not scientific evidence, but in the early years of the cognitive revolution several experiments on attention did suggest that the mind is not a single serial computer.
In one of the best known, conducted in 1973, subjects put on headphones after being told by the experimenters, James Lackner and Merrill Garrett, to pay attention only to what they heard with the left ear and to ignore what they heard with the right one. With the left ear they heard ambiguous sentences, such as “The officer put out the lantern to signal the attack”; simultaneously, with the right ear some heard a sentence that would clarify the ambiguous one if they were paying attention to it (“He extinguished the lantern”), while others heard an irrelevant sentence (“The Red Sox are playing a doubleheader tonight”).
Neither group could say, afterward, what they had heard with the right ear. But when asked the meaning of the ambiguous sentence, those who had heard the irrelevant sentence with the right ear were divided as to whether the ambiguous one had meant that the officer snuffed out or set out the lantern, but nearly all of those who had heard the clarifying sentence chose the snuffed out interpretation. Apparently, the clarifying sentence had been processed simultaneously and unconsciously along with the ambiguous one.103
This was one of several reasons why, during the 1970s, a number of psychologists began to hypothesize that thinking does not proceed serially. Another reason was that serial processing could not account for most human cognitive processes; the neuron is too slow. It operates in milliseconds, so human cognitive processes that take place in a second or less would have to comprise no more than a hundred serial steps. Very few processes are that simple, and many, including perception, recall, speech production, sentence comprehension, and “matching” (pattern or face recognition), require vastly greater numbers.
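The arithmetic behind the hundred-step limit can be made explicit. The sketch below assumes, purely for illustration, that one neural step takes about 10 milliseconds; the text says only that the neuron “operates in milliseconds,” so that figure and the function name `max_serial_steps` are assumptions.

```python
def max_serial_steps(task_ms: int, step_ms: int) -> int:
    """Upper bound on strictly serial steps that fit into a task's duration."""
    return task_ms // step_ms


if __name__ == "__main__":
    # A one-second task with an assumed 10 ms per neural step.
    print(max_serial_steps(task_ms=1000, step_ms=10))  # prints 100
```

Under that assumption a one-second process has room for only a hundred steps in sequence, far too few for perception or sentence comprehension if each step must wait for the one before it.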
By 1980 or so, a number of psychologists, information theorists, physicists, and others began developing detailed theories of how a parallel-processing system might work. The theories are extremely technical and involve high-level mathematics, symbolic logic, computer science, schema theory, and other arcana. But David Rumelhart, one of the leaders of the movement, summed up in simple language the thinking that inspired him and fifteen colleagues to develop their version, “Parallel Distributed Processing” (PDP):
Although the brain has slow components, it has very many of them. The human brain contains billions of such processing elements. Rather than organize computation with many, many serial steps, as we do with systems whose steps are very fast, the brain must deploy many, many processing elements cooperatively and in parallel to carry out its activities. These design characteristics, among others, lead, I believe, to a general organization of computing that is fundamentally different from what we are used to.104
PDP also departed radically from the computer metaphor used until then in its explanation of how information is stored. In a computer, information is retained by the states of its transistors. Each is either switched on or off (representing a 0 or a 1), and strings of 0’s and 1’s stand for numbers symbolizing information of all sorts. When the computer is running, electric current maintains these states and the information; when you turn it off, everything is lost. (Permanent storage on a disk is another matter altogether; the disk is outside the operating system, much as a written memo is outside the mind.) This cannot be the mind’s way of storing information. For one thing, a neuron is not either on or off; it adds up inputs from thousands of other neurons and, reaching a certain level of excitation, transmits an impulse to still other neurons. But it does not remain in an active state for more than a fraction of a second, so only very short-term memory is stored in the mind by neuronal states. And since memories are not lost when the brain is turned off in sleep or in the unconsciousness caused by anesthesia, it must be that longer-term storage in the brain is achieved in some other fashion.
The new view, inspired by brain research, is that knowledge is stored not in an “on or off” state of the neurons but in the connections among them formed by experience.* In the case of a machine, the connections exist among the “units” of a parallel distributed processor. As Rumelhart said,
Almost all knowledge is implicit in the structure of the device that carries out the task…It is built into the processor itself and directly determines the course of processing. It is acquired through the tuning of connections as these are used in processing, rather than formulated and stored as declarative facts.105
The new theory, accordingly, came to be known as “connectionism,” and has remained the number one buzzword of current cognitive theory.106 In an interesting reversal, cognitive psychologists no longer think of mental processes as taking place in computer-like serial fashion, and their connectionist model of mental processes, based on neurological evidence, has become the guiding standard for computer design.
Rumelhart and his collaborators were not the only psychologists to imagine how connectionism in the human mind might work; a number of other connectionist models have been drawn up in recent years. But the basic concept underlies them all, namely, that the brain functions by means of exceedingly intricate and complex networks of multiple interconnections among its neurons, enabling the mind, among other things, to work both consciously and unconsciously at the same time, make decisions involving multiple variables, recognize meanings of spoken or written words, and on and on.
For our purposes, the Rumelhart et al. model can serve to exemplify the whole genre. A diagram that Rumelhart and two of his collaborators drew up for their book on PDP will make the PDP idea clear if you are willing to take a minute or two to trace it through. It is a portrait not of a bit of brain tissue but of a bit of a theoretical connectionist network:
FIGURE 43
The mind’s wiring? A hypothetical example of a connectionist network
Units 1 to 4 receive inputs from the outside world (or some other part of the network), plus feedback from the output units, 5 to 8. The connections among the units are indicated symbolically by the unnumbered disks: the bigger an open disk, the stronger the connection; the bigger a filled disk, the stronger the inhibition or interference with transmission. Thus, unit 1 does not influence unit 8, but does influence 5, 6, and 7 to varying degrees. Units 2, 3, and 4 all influence 8 in widely varying degrees, and 8 in turn feeds back to the input units, affecting 1 almost not at all, 3 and 4 rather little, and 2 strongly. All this goes on at the same time and results in an array of outputs, in contrast to the single process and single output of serial design.
Although Rumelhart and his collaborators said that “the appeal of PDP models is definitely enhanced by their physiological plausibility and neural inspiration,” the units in the diagram are not neurons and the connections are not synapses.107 The diagram represents not a physical entity but what happens; the brain’s synapses and the model’s connections operate in different ways to inhibit some connections and strengthen others. In both cases, the connections are what the system knows and how it will respond to any input.108
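For readers who find code easier to trace than a diagram, here is a minimal Python sketch of an eight-unit network of the kind described. It is not the authors’ model: the weight values are invented, chosen only so that they respect the relationships stated in the text (unit 1 has no connection to unit 8; unit 8 feeds back strongly to unit 2, weakly to 3 and 4, and hardly at all to 1). The squashing function and the names `FORWARD`, `FEEDBACK`, and `step` are likewise assumptions.

```python
import math

# Positive weights stand for open disks (excitation), negative weights for
# filled disks (inhibition). All values are hypothetical.

# FORWARD[j][i]: connection from input unit i+1 to output unit j+5.
FORWARD = [
    [0.8, 0.2, -0.4, 0.1],   # into unit 5
    [0.3, -0.6, 0.5, 0.2],   # into unit 6
    [-0.5, 0.4, 0.1, 0.7],   # into unit 7
    [0.0, 0.9, -0.2, 0.3],   # into unit 8: nothing from unit 1
]

# FEEDBACK[i][j]: connection from output unit j+5 back to input unit i+1.
FEEDBACK = [
    [0.1, 0.0, -0.2, 0.01],  # into unit 1: almost nothing from unit 8
    [0.0, 0.3, 0.1, 0.9],    # into unit 2: strong feedback from unit 8
    [-0.3, 0.2, 0.0, 0.1],   # into unit 3: weak feedback from unit 8
    [0.2, -0.1, 0.4, 0.1],   # into unit 4: weak feedback from unit 8
]


def squash(x: float) -> float:
    """Keep every activation between 0 and 1 (an assumed logistic function)."""
    return 1.0 / (1.0 + math.exp(-x))


def step(external, outputs):
    """One update: all input units are recomputed together from the external
    signal plus feedback, then all output units together from the inputs."""
    inputs = [
        squash(external[i] + sum(FEEDBACK[i][j] * outputs[j] for j in range(4)))
        for i in range(4)
    ]
    new_outputs = [
        squash(sum(FORWARD[j][i] * inputs[i] for i in range(4)))
        for j in range(4)
    ]
    return inputs, new_outputs


if __name__ == "__main__":
    outputs = [0.0, 0.0, 0.0, 0.0]
    for _ in range(5):
        inputs, outputs = step([1.0, 0.0, 1.0, 0.0], outputs)
    print([round(value, 3) for value in outputs])  # activations of units 5 to 8
```

Nothing in the program stores a fact as such; what the network “knows” is the two tables of connection strengths, and the result is an array of four outputs rather than a single answer.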
A simple demonstration: What word is partly obscured by inkblots in this diagram?
FIGURE 44
How can you tell what the partly obscured word is?
You probably recognized instantly that the word is pen. But how did you know that? Each partly obscured letter could have been other than the one you took it to be.
The explanation (based on a similar example by Rumelhart and Jay McClelland): The vertical line in the first letter is an input into your recognition system that strongly connects to units in which P, R, and B are stored; the curved line connects to all three. On the other hand, the sight of the straight line does not connect—or, one can say, is strongly inhibited from connecting—to any unit representing rounded letters like C or O. Simultaneously, what you can see of the second letter is strongly connected to units registering F and E, but crossfeed from the first letter connects strongly to E not F, because experience has established PE, RE, and BE but not PF, RF, or BF as the beginning of an English word. And so on. Many connections, all operating at the same time and in parallel, enable you to see the word instantly as pen and not as anything else.109
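The way partial evidence from several letters combines to settle on one word can be imitated in a few lines of Python. This is a much-simplified stand-in, not the Rumelhart and McClelland model itself: it merely multiplies the evidence for each letter of every known word and picks the best-supported word. The evidence values, the third-letter candidates, and the six-word vocabulary are all invented.

```python
# Bottom-up evidence for each letter position, given what shows through the
# inkblots. All numbers are hypothetical.
EVIDENCE = [
    {"P": 0.4, "R": 0.3, "B": 0.3},  # a vertical line plus a curve
    {"E": 0.5, "F": 0.5},            # E and F look equally likely in isolation
    {"N": 0.6, "H": 0.4},            # assumed candidates for the third letter
]

# A tiny made-up vocabulary standing in for "experience" of English words.
LEXICON = ["PEN", "PIN", "RED", "BEG", "FEN", "HEN"]


def word_support(word: str) -> float:
    """Multiply the evidence for each letter; an unsupported letter gives zero."""
    support = 1.0
    for position, letter in enumerate(word):
        support *= EVIDENCE[position].get(letter, 0.0)
    return support


def recognize():
    """Score every known word and return the best one with all the scores."""
    scores = {word: word_support(word) for word in LEXICON}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = recognize()
    print(best)  # prints PEN
```

No single letter is certain on its own, and F is as plausible as E in isolation, yet only PEN is consistent with the evidence at every position and with the known words. In this toy version the scores are computed one after another; in the connectionist account all of these constraints act at the same time.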
On a larger scale, the connectionist model of information processing is in striking accord with other seminal findings of cognitive psychological research. Consider, for instance, what is now known about the semantic memory network in Figure 41. Each node in that network (“bird,” “canary,” and “sing,” for instance) corresponds to a connectionist module something like the entire array in the last diagram but perhaps consisting of thousands of units rather than eight.110 Imagine, if you can, enough such multithousand-unit modules to register all the knowledge stored in your mind, each with millions of connections to related modules, and… But the task is too great for imagination. The connectionist architecture of the mind is no more possible to visualize in its entirety than the structure of the universe; only theory and mathematical symbols can encompass it.