by Morton Hunt
The 1908 scale was a remarkable success. By 1914 over 250 articles and books had been published commenting on or making use of it, and by 1916 the 1908 or 1911 revisions were being used throughout much of the United States, Canada, England, Australia, New Zealand, South Africa, Germany, Switzerland, Italy, Russia, and China, and had been translated into Japanese and Turkish. The need for such a measuring device clearly had become great in industrial societies. The psychologist Henry H. Goddard, who introduced the scale to American psychologists in 1910, wrote in 1916 that it was hardly an exaggeration to say that “the world is talking of the Binet-Simon Scale.”34 And that was only the beginning.
Binet, who died in 1911 at the age of fifty-four, did not live to see his triumph, but if he had, he might have been saddened to note that his scale, widely adopted in other countries, was neither appreciated nor used in France. It came into use there only in the 1920s, when a French social worker brought it back from America. Binet himself was little esteemed in his own country until 1971, when at last a ceremony honoring him and Simon was held at the school where he had instituted experimental methods of teaching the retarded.
The Testing Mania
Nowhere was intelligence testing as swiftly and enthusiastically adopted as in the United States. And for good reason. In a country with a fluid social structure, a rapidly expanding need for workers who could master complex technological jobs, a growing underclass of the poor, delinquent, and criminal, and an influx of millions of ill-educated and seemingly semiprimitive immigrants, a scientific way of evaluating the mental capacity of individuals offered the leaders of society a way to make social order out of chaos.35
But while Binet had believed that the intelligence of mental defectives, especially those close to normal, could be increased by special training, most of the early advocates of intelligence testing in the United States took Galton’s position that heredity was the largest determinant of mental development and that the individual’s intelligence was therefore unchangeable. They saw mental measurement as a means by which society could channel its members into the kinds of schools and jobs their innate capacity fitted them for and as a diagnostic device with which to identify those individuals who should be restrained from reproducing and passing on their defectiveness.
Henry Goddard was one of the leading exponents of this view. Goddard (1865–1957), a forceful and dynamic man, had been trained at Clark University, where G. Stanley Hall (one of Wundt’s early students), a convinced hereditarian, was the head of the psychology department. Goddard absorbed the hereditarian view, and when, in 1906, he became the director of the research laboratory of the Vineland, New Jersey, Training School for the Feeble-minded, it seemed to him that he saw it amply confirmed all around him; many of the feeble-minded not only behaved, but looked, innately flawed. Goddard even hypothesized that mental defectiveness was due to a single recessive gene.36
He did recognize, however, that the children at Vineland were not all defective to the same degree, and that to determine what kind of training would be best for each one, he needed a way of measuring individual levels of mental ability. For a while he tried using Cattell’s anthropometric tests, but with no success. Then, during a trip to France, he learned about the 1908 Binet-Simon scale, recognized its merits, and immediately translated it into English, making no changes except to replace a few French cultural references with American equivalents.
Goddard was the first to use the Binet-Simon scale for mass testing; he administered it to four hundred children at the training school and two thousand children in the New Jersey public schools. His results showed a broad range of intelligence scores among the feeble-minded children and also, surprisingly, among public school pupils, an alarming number of whom tested below their age norms.37
This motivated him to begin a campaign for intelligence testing in the public schools to locate below-normal children and shunt them into special classes; he also began offering courses for teachers in the use of the Binet-Simon scale and distributed thousands of copies to colleagues across the United States. Within half a dozen years the Binet-Simon scale was being used in many public schools, where it played an important part in the decisions teachers made about the education of students. It was also in use at a number of institutions for “mental defectives,” reform schools, and juvenile and police courts, where it influenced the treatment accorded inmates and offenders.38
Goddard argued that low intelligence was a serious societal problem that had to be vigorously attacked. Idiots and imbeciles were no threat to society, he said, since they usually do not propagate their kind, but “high-grade defectives” or morons (a word Goddard invented) were very likely, he claimed, not only to become social misfits or criminals but to beget offspring who were equally likely to become antisocial. He also viewed the matter the other way around, saying that many criminals, most alcoholics and prostitutes, and “all persons who are incapable of adapting themselves to their environment and living up to the conventions of society or acting sensibly” were hereditarily mentally inferior.39
These assertions were based both on his use of the Binet-Simon scale and on his study of the descendants of a soldier in the American Revolution, one Martin Kallikak (a pseudonym), who sired a son by a feeble-minded barmaid and later married a Quaker woman and had children by her. Goddard traced Kallikak’s many hundreds of descendants by both women down the generations to the early years of the twentieth century, and reported that a majority of those on the barmaid’s side were feeble-minded, immoral, or criminal, and nearly all of those on the Quaker woman’s side were upstanding members of society.
We now know that the study was grievously flawed. Among other things, few family members were or could be tested, and most were rated as to intelligence by looks alone or secondhand reports and hearsay. Also, Goddard said that the environments in which descendants of both sides were raised were basically the same, but existing information (such as infant mortality in the two lines) clearly showed the opposite. But at the time (1912) and for many years, The Kallikak Family was taken by many psychologists and the reading public to be dramatic proof of the genetic transmission of intellectual ability—Goddard actually spoke of “good blood” and “bad blood”40—and of its social consequences.
Goddard’s Binet-Simon data and his findings about the Kallikak family led him to take a position far more severe than Galton’s: “It is perfectly clear that no feeble-minded person should ever be allowed to marry or become a parent. It is obvious that if this rule is to be carried out the intelligent part of society must enforce it.”41 In pursuit of this goal, he served as an expert witness to two national committees advocating the sterilization of “mentally defective” people, one of which sweepingly extended the recommendation to paupers, criminals, epileptics, the insane, and the congenitally handicapped.
Legislators were impressed by Goddard’s testimony and that of other psychologists. By 1931 twenty-seven states had laws authorizing eugenic sterilizations, and thousands of mentally and socially “defective” people were sterilized during the next three decades—nearly ten thousand in California alone. By the 1960s, however, both because the compulsory sterilization of the unfit seemed akin to Nazi policies and because an environmental view of mental and social disability had become dominant, state legislatures began repealing the laws in favor of statutes authorizing sterilization of the mentally retarded on a voluntary basis.
Goddard made an equally consequential social application of the Binet-Simon scale to the immigration question. Since the turn of the century, immigrants had been pouring into the country. Many were illiterate and socially backward, raising fears that the nation was being swamped by social and mental defectives. Congress had passed a law forbidding entry to lunatics and idiots, and immigration inspectors rejected about 10 percent of the thousands arriving each day, but it was thought that many others were slipping through. In 1913 the United States commissioner of immigration asked Goddard to study the screening procedures at Ellis Island and offer his advice. For a week, Goddard and several assistants picked out immigrants whose appearance they considered suggestive of mental defectiveness and, through interpreters, gave them the Binet-Simon. Most scored in the defective range (hardly surprising, in view of their fatigue, fear, lack of education, and the difficulties of interpretation), and Goddard thereupon recommended that immigration inspectors henceforth use brief “psychological methods” based on Binet-Simon testing. In 1913 deportations of ostensibly feeble-minded immigrants rose by 350 percent, and in 1914 half again as much.42
Goddard continued his work at Ellis Island for some months in 1914; the testing of a sample of arriving immigrants showed that about four fifths of the Jews, Hungarians, Italians, and Russians were feeble-minded. Even Goddard was incredulous; he reviewed the data, tinkered with the results, and lowered the figures, but only to the 40 to 50 percent range. These findings, along with evidence offered by other psychologists of like mind, influenced Congress in its drafting of the severely restrictive immigration law of 1924, which reduced total quotas for southern and eastern Europe to less than a fifth of those for northern and western Europe.43
Despite the acceptance of Goddard’s translation of the Binet-Simon scale, Lewis M. Terman, a professor of psychology at Stanford University, saw certain flaws in it, and felt that he could correct them and make the scale more accurate. Like Goddard and many others who subscribed to the hereditarian view of intelligence, Terman believed there was a social need for such an instrument. He also saw a scientific need for it: although he was a hereditarian, he said that the relative influences of heredity and environment would not be known until perfected intelligence tests were widely used,44 and he undertook a major revision of the Binet-Simon scale, known as the Stanford-Binet scale.
Terman himself had no personal reason to believe in the inheritance of intelligence; he was the twelfth of fourteen children of an Indiana farm family, none of whose members and none of whose ancestors on either side had ever belonged to a profession or gone to college.45 But when he was ten, an itinerant book peddler, while selling Terman’s parents a book on phrenology, felt the boy’s head and proclaimed that he had unusual abilities. The incident may have given Terman his bent toward the innatist view, and his subsequent history seemed to confirm it. He was able to work his way up, despite serious financial odds, from a country school to normal school, thence to college, and finally, by means of a fellowship, to Clark University, where he earned a doctorate in psychology in 1905. By that time he was a convinced hereditarian and admirer of Galton’s.
At Stanford University he spent several years in the education department and then became head of the psychology department. In the course of a long and distinguished career he made the department a leading graduate and research center, conducted a respected long-term study of gifted children, and carried out a classic study of the psychological factors in marital happiness. But his main claim to fame, major contribution to psychology, and chief influence on American life was the Stanford-Binet scale.
Terman’s experience with the Binet-Simon scale, even in its 1911 revision, had led him to believe that it had too few tests at the upper mental levels, that many tests at both the low end and the high end were misplaced in the sequence, and that the correct procedures for giving and interpreting the test were inadequately defined. With the help of eight collaborators and many public school teachers, he tried out the old tests and forty new ones (twenty-seven of which, and nine others taken from other sources, were added to the final series) on seventeen hundred normal children, two hundred retarded and superior children, and over four hundred adults. In its final form, the Stanford-Binet scale comprised ninety tests; those applicable to children between the ages of three and five took about half an hour to administer, and those for older groups took progressively longer; the adult level required from an hour to an hour and a half.46
How well children of any age did with each test was compared with how well they did on others; those tests which were too easy for children of a given age were shifted to an earlier place in the sequence, those which were too hard, to a later one. To balance the scale, additional tests were added at the lower and upper ends. The results of the testing were compared with teachers’ estimates of the same children’s intelligence by the Pearsonian correlation method; the overall correlation was .48, or moderately high, thereby validating the scale. The correlation would have been still higher had not teachers, in estimating children’s intelligence, sometimes failed to take into account that some of the children were either younger or older than most of their classmates.
The most valuable aspect of the revision was that the entire scale was far more thoroughly “standardized” than Binet-Simon or Goddard-Binet-Simon; that is, the scores were based on results achieved with a large standard sample of normal, retarded, and superior children and adults. On this basis, a child or adult who scored 100 was average; one who scored 130 or better was more intelligent than 99 percent of the population at large; and one who scored 70 or below was less intelligent than 99 percent of the population. Terman classified the grades of intelligence as follows:
140 and up …… “Near” genius or genius
120–140 …… Very superior intelligence
110–120 …… Superior intelligence
90–110 …… Normal or average intelligence
80–90 …… Dullness, rarely classifiable as feeble-mindedness
70–80 …… Border-line deficiency, sometimes classifiable as dullness, often as feeble-mindedness
Below 70 …… Definite feeble-mindedness
Terman, a mild-mannered and kindly man, voiced benign hopes for the use of the new scale:
When we have learned the lessons which intelligence tests have to teach, we shall no longer blame mentally defective workmen for their industrial inefficiency, punish weak-minded children because of their inability to learn, or imprison and hang mentally defective criminals because they lacked the intelligence to appreciate the ordinary codes of social conduct.47
If the Stanford-Binet did not exactly make those sentiments a reality, neither did it, fortunately, make a reality of Terman’s vision of its use in eugenics:
It is safe to predict that in the near future intelligence tests will bring tens of thousands of… high-grade defectives under the surveillance and protection of society. This will ultimately result in curtailing the reproduction of feeble-mindedness, and in the elimination of an enormous amount of crime, pauperism, and industrial inefficiency.48
The Stanford-Binet, published in 1916, swiftly became the standard test for measuring intelligence and remained so for over two decades. It was soon being used in a number of schools, preschools, colleges, and institutions for the feeble-minded. But its influence was both broader and more profound than that; the Stanford-Binet scale (and, later, its 1937 revision) became the standard for virtually all IQ tests that followed it. What Binet, Simon, and Terman took to be the attributes making up intelligence became the model for nearly all later intelligence tests; these components included memory, language comprehension, size of vocabulary, eye-hand coordination, knowledge of familiar things, judgment, likenesses and differences, arithmetical reasoning, ability to detect absurdities, speed and richness of association of ideas, and several others.49
A subsequent test using Stanford-Binet components revolutionized the field of intelligence testing.
All versions of the Binet scale—eventually there were dozens—have to be given by a psychologist or trained technician to one person at a time. But group testing, in which subjects read questions to themselves and check off multiple-choice answers or make appropriate marks on the form, would be far quicker, simpler, and very much less expensive.
This breakthrough in mental measurement came about as a result of the entry of the United States into the First World War. Within two weeks of President Woodrow Wilson’s signing of the declaration of war, on April 6, 1917, the American Psychological Association appointed a committee to see what role psychology could play in the war effort. The committee reported that the most useful contribution of the profession would be the development of psychological examinations that could be quickly given to large numbers of military personnel so as to eliminate the mentally incompetent, classify individuals according to their abilities, and select the most competent for special training and responsible positions.
A group of psychologists—among them Terman, Goddard, and Robert Yerkes, a Harvard professor—met at Vineland and began planning the tests. In August, Yerkes was commissioned a major in the Army and was ordered to carry out the plans. He assembled a staff of forty psychologists who, in two months, produced the Army Alpha, a written test of intelligence, and the Army Beta, a pictorial test for the 40 percent of inductees who were functionally illiterate (the instructions for Beta were read aloud by an assistant). The widely used Alpha looks, from today’s perspective, like a curious mixture of scientific information, folk wisdom, and morality, as can be seen by these questions:
1. If plants are dying for lack of rain, you should
—water them,
—ask a florist’s advice,
—put fertilizer around them.
8. It is better to fight than to run, because
—cowards are shot,
—it is more honorable,
—if you run you may get shot in the back.
11. The cause of echoes is
—the reflection of sound waves,
—the presence of electricity in the air,
—the presence of moisture in the air.
Yerkes’ team began giving the tests in four camps, but within weeks the surgeon general decided to extend the program to the entire Army; by the time the war ended, in November 1918, more than 1.7 million men had taken the tests and some three hundred psychologists under Yerkes had graded each man and suggested a suitable military assignment for him.50 Although Yerkes’ psychological corps met resistance and noncompliance from professional officers, the tests resulted in the discharge of about eight thousand men as unfit and the assignment of about ten thousand of a low level of intelligence to labor battalions and similar services. Of greater importance, the Alpha was a factor in the selection of two thirds of the 200,000 men who became commissioned officers during the war.51