16. Novembre and Barton (2018). For an example of the development of a new method for identifying natural selection, its application to natural selection for height, and the problems introduced by population stratification, see the sequence of Berg and Coop (2014), Berg, Zhang, and Coop (2017), Sohail, Maier, Ganna et al. (2018), and Berg, Harpak, Sinnott-Armstrong et al. (2018).
17. Marth, Yu, Indap et al. (2011).
18. Telenti, Pierce, Biggs et al. (2016): 1, 5 of 6.
19. Examples of reviews of techniques and findings are Auer and Lettre (2015) and Bomba, Walter et al. (2017). Examples of studies finding a significant contribution of rare variants are Gilly, Suveges, Kuchenbaecker et al. (2018), Mancuso, Rohland, Rand et al. (2015), Fournier, Abou Saada, Hou et al. (2019), and Wainschtein, Jain, Yengo et al. (2019).
20. Gravel, Henn, Gutenkunst et al. (2011), Marth, Yu, Indap et al. (2011).
21. A few examples: Moore, Wallace, Wolfe et al. (2013) studied low-frequency (.005 to .030) SNPs and found that high proportions of bins of SNPs from DNA regions that showed indications of natural selection were significantly different across continental populations. There was also a clear ordering: Asia and Europe had the fewest significantly different bins (51 percent), compared to Africa and Europe (81 percent) and Africa and Asia (83 percent). Marth, Yu, Indap et al. (2011) studied rare variants (frequency less than .01) using the 1000 Genomes Exon Pilot database and found that “coding variants below 1 percent allele frequency show increased population-specificity and are enriched for functional variants.”
22. Lactase persistence (LP). Human infants everywhere have always been able to digest the milk sugar lactose, but for tens of thousands of years after humans left Africa that ability universally faded after weaning. The advent of cattle and goat domestication led to strong selection pressure for the ability to drink milk. A variety of explanations for the increase in fitness have been advanced (Gerbault, Liebert, Itan et al. 2011). The simple food value of milk, which provides protein, micronutrients, calcium, and carbohydrates, was surely a factor, but there are other ways in which LP might have had survival value. When crops failed, the availability of milk would have been of great survival value for adults who were lactase persistent and would have increased the mortality rate of those who tried to drink it but were not lactase persistent (by inducing dehydration through diarrhea). Because milk contains vitamin D and calcium, LP might have been especially valuable at northern latitudes where sunlight was comparatively scarce. However the benefits combined, lactase persistence spread rapidly after mutations fostering it appeared. In Eurasia, LP began to occur around 9,000 years ago and spread throughout Europe during the last 4,000 years. In Africa, LP also occurred among East African tribes that domesticated cattle around 5,000 years ago, but through a distinctive genetic route (Fan, Hansen, Lo et al. 2016). Field, Boyle, Telis et al. (2016) found strong evidence that selection pressure in Europeans has persisted into the last 2,000 years.
Sickle cell anemia. I mentioned the discovery of the genetic cause of susceptibility to sickle cell anemia in the discussion of candidate genes in chapter 8. It is an example of a gene variant that confers what is called a “heterozygote advantage”: Having one copy of an allele is a good thing (protecting against a common form of malaria); having two copies carries the good thing but also carries a bad thing (susceptibility to sickle cell anemia). If the proportion of people who carry just one copy of the target allele is enough larger than the proportion who carry two copies and/or the fitness advantage of carrying the allele is enough larger than the fitness disadvantage of carrying two copies of it, then a harmful allele under some circumstances can nonetheless spread in frequency among a population through natural selection. The net effect is that the allele is fitness enhancing.
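The balance described above can be made concrete with the standard textbook one-locus model of heterozygote advantage. The following is a minimal sketch, not a model of any real population: the selection coefficients `s` (fitness cost of carrying no copies, through exposure to malaria) and `t` (fitness cost of carrying two copies, through sickle cell anemia) are hypothetical values chosen only for illustration, as are all function names. Under these assumptions the allele rises from rarity and settles at the frequency s / (s + t) instead of either disappearing or taking over.

```python
def next_frequency(q, s, t):
    """Frequency of the sickle allele S after one generation of selection.

    Hypothetical relative fitnesses (illustrative, not from the text):
      AA (no copies)  = 1 - s   unprotected against malaria
      AS (one copy)   = 1       protected carrier
      SS (two copies) = 1 - t   sickle cell anemia
    """
    p = 1.0 - q
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    # Surviving S alleles come from AS carriers and SS homozygotes.
    return (p * q + q * q * (1 - t)) / mean_fitness


def equilibrium_frequency(s, t):
    """Stable frequency of S when the heterozygote is the fittest genotype."""
    return s / (s + t)


def simulate(q0, s, t, generations):
    """Iterate selection for a number of generations, starting from q0."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, s, t)
    return q


if __name__ == "__main__":
    s, t = 0.1, 0.8  # assumed values, for illustration only
    print("simulated:", simulate(0.001, s, t, 2000))
    print("predicted:", equilibrium_frequency(s, t))
```

If `s` is set to zero (no malaria, so no advantage to carriers), the same recursion drives the allele steadily downward, which is the sense in which the allele is fitness enhancing only under some circumstances.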
23. Darwin thought that light skin was a result of sexual selection. Narasimhan, Rahbari, Scally et al. (2016). But instead the primary cause seems to have been that intense sunlight is damaging to some essential nutrients, especially folate, which is necessary for DNA synthesis and repair. Folate deficiency has a variety of other bad effects, including complications during pregnancy, fetal abnormalities such as spina bifida, and damage to spermatogenesis. Parra (2007). All of these effects of high exposure to ultraviolet radiation (UVR) have direct effects on reproductive fitness. The humans who left Africa had dark skin because they had evolved to have a high level of melanin, a pigment that acts as a photoprotective layer, filtering UVR, which is what commercial sunblocks now do. Jablonski and Chaplin (2010). Among other things, Parra (2007) had argued that a high level of exposure to UVR led directly to all the harmful effects associated with sunburn, including (besides pain) edema, disruption of thermoregulation, and increased risk of infection. It also causes skin cancer. But it now appears that none of those had a significant effect on reproductive fitness. Jablonski and Chaplin (2010). So why did skin eventually lighten among all of the African emigrants who moved north? It’s not as obvious as saying “They didn’t need dark skin anymore.” Nor does skin naturally get lighter across generations in the absence of intense sunlight. The main reason that light skin was advantageous in high latitudes appears to have been that UVR is essential for the synthesis of vitamin D in the skin, and lighter skin allows greater absorption of UVR. Vitamin D deficiency is implicated in rickets in children and softening of the bones in adults, and impedes other recently discovered functions involving immunoregulation and regulation of cell differentiation and proliferation.
Along with skin lightening, populations in Europe also evolved a capacity for tanning. To make things still more complicated, it has been genetically verified that at least three independent evolutionary lines for lightening were taken. The three lines all involved different genetic and physiological mechanisms. Jablonski and Chaplin (2010). Positive selection for skin color within western Eurasia has continued into the last 5,000 years. Wilde, Timpson, Kirsanow et al. (2013).
24. Brinkworth and Barreiro (2014): 69.
25. Brinkworth and Barreiro (2014): 69.
26. Nédélec, Sanz, Baharian et al. (2016): 666.
27. Yin, Low, Wang et al. (2015): 5 of 11.
28. Yin, Low, Wang et al. (2015): 1 of 11.
29. Bigham (2016).
30. Bigham (2016): 9.
31. Huerta-Sanchez, Jin, Asan et al. (2014).
32. Rivas, Avila, Koskela et al. (2018).
33. Cochran and Harpending (2009): 213–14.
34. Cochran and Harpending (2009): 217. Their hypothesis is that natural selection occurred “because of the unique natural-selection pressures the members of this group faced in their role as financiers in the European Middle Ages.” Their proposition is that the fitness-reducing effects of the harmful diseases were more than counterbalanced by their fitness-increasing effects on cognitive ability. It is a provocative hypothesis, but whether it is true is not the point for this discussion.
35. Lachance, Berens, Hansen et al. (2018): 16.
36. Guo, Wu, Zhu et al. (2018): 1 of 9.
37. Takeuchi, Akiyama, Matoba et al. (2018): 6 of 15.
38. Takeuchi, Akiyama, Matoba et al. (2018): 7 of 15.
39. Fumagalli, Moltke, Grarup et al. (2015).
40. Fan, Hansen, Lo et al. (2016).
41. Historical and contemporary statistics and sources on commission of homicide by sex are available at ourworldindata.org/homicides.
42. That unyielding insistence is what led geneticist David Reich to write in the pages of the New York Times, “It is important, even urgent, that we develop a candid and scientifically up-to-date way of discussing any such differences, instead of sticking our heads in the sand and being caught unprepared when they are found.” David Reich, “How Genetics Is Changing Our Understanding of ‘Race’,” New York Times, March 23, 2018.
Part III: “Class Is a Function of Privilege”
1. Herrnstein and Murray (1994).
2. Crenshaw (1989): 140.
3. Andersen and Collins (2019): 4.
4. If you believe that adjusting for IQ is meaningless because racism completely accounts for the observed black/white (B/W) difference in IQ, then let’s walk through the logic step by step.
An initial implication would seem to be that the mean black IQ will be close to the same as the white mean in societies where blacks have been the ruling population for several decades—most sub-Saharan African countries and Haiti. That doesn’t work, because the observed means for black IQ in those countries are uniformly lower than the mean of blacks in the United States. Wicherts, Dolan, and van der Maas (2010).
If your reaction is that these results reflect poor educational systems in Africa and Haiti (in part, they certainly do) and the legacy of colonial racism, then the next step in defending your position that adjusting for IQ is meaningless is to think about why racism would affect IQ. The most obvious answer is through socioeconomic status—racism accounts for disproportionate black poverty and underrepresentation in high-prestige jobs, which in turn deleteriously affects the environments in which black children grow up and thereby their IQs.
To test that proposition, the B/W difference needs to be examined after adjusting for parental SES. This has been done frequently. The technical literature consistently shows that doing so diminishes the size of the B/W difference by about a third. I could leave it at that (two-thirds of the difference is not explained by parental SES), but it is also important to note that in most studies the size of the B/W difference expressed in standard deviations increases as parental SES rises. These statements are documented in The Bell Curve. Herrnstein and Murray (1994): 286–88. Given that I was a coauthor, it may be useful to draw on an independent and authoritative source.
In the wake of the controversy over The Bell Curve, the American Psychological Association assembled a Task Force on Intelligence consisting of 11 of the most distinguished psychometricians in the United States, chaired by Ulric Neisser. Their report, titled “Intelligence: Knowns and Unknowns,” was published in the February 1996 issue of the APA’s flagship journal, American Psychologist. It was a consensus statement with no minority dissents. In the interests of concision, I am going to quote from the report when nothing needs to be added that has emerged since it was prepared. I have omitted references embedded in the report’s text. The reference for the following quotes is Neisser, Boodoo, Bouchard et al. (1996).
Regarding SES as an explanation of the B/W difference:
Several considerations suggest that this cannot be the whole explanation. For one thing, the Black/White differential in test scores is not eliminated when groups or individuals are matched for SES. Moreover, the data reviewed in Section 4 suggest that if we exclude extreme conditions, nutrition and other biological factors that may vary with SES account for relatively little of the variance in such scores. Finally, the (relatively weak) relationship between test scores and income is much more complex than a simple SES hypothesis would suggest. The living conditions of children result in part from the accomplishments of their parents: If the skills measured by psychometric tests actually matter for those accomplishments, intelligence is affecting SES rather than the other way around. We do not know the magnitude of these various effects in various populations, but it is clear that no model in which “SES” directly determines “IQ” will do. (Neisser, Boodoo, Bouchard et al. (1996): 94).
Another obvious way to discount the value of adjusting for IQ is that the tests are biased against blacks. This is the Task Force’s statement regarding cultural bias because of language:
The language of testing is a standard form of English with which some Blacks may not be familiar; specific vocabulary items are often unfamiliar to Black children; the tests are often given by White examiners rather than by more familiar Black teachers; African Americans may not be motivated to work hard on tests that so clearly reflect White values; the time demands of some tests may be alien to Black culture. (Similar suggestions have been made in connection with the test performance of Hispanic Americans.) Many of these suggestions are plausible, and such mechanisms may play a role in particular cases. Controlled studies have shown, however, that none of them contributes substantially to the Black/White differential under discussion here. Moreover, efforts to devise reliable and valid tests that would minimize disadvantages of this kind have been unsuccessful. (Neisser, Boodoo, Bouchard et al. (1996): 93–94).
With regard to cultural bias in predictive validity, the Task Force wrote:
From an educational point of view, the chief function of mental tests is as predictors.… Intelligence tests predict school performance fairly well, at least in American schools as they are now constituted. Similarly, achievement tests are fairly good predictors of performance in college and postgraduate settings. Considered in this light, the relevant question is whether the tests have a “predictive bias” against Blacks. Such a bias would exist if African American performance on the criterion variables (school achievement, college GPA, etc.) were systematically higher than the same subjects’ test scores would predict. This is not the case. The actual regression lines (which show the mean criterion performance for individuals who got various scores on the predictor) for Blacks do not lie above those for Whites; there is even a slight tendency in the other direction. Considered as predictors of future performance, the tests do not seem to be biased against African Americans. (Neisser, Boodoo, Bouchard et al. (1996): 93).
Postdating the Task Force’s report, Fagan and Holland (2007) presented experimental evidence that score differences between black and white students were effectively eliminated when they were tested on the basis of newly learned information, and argued that the black/white difference was the result of differences in specific previous knowledge, which in turn reflected test bias. A similar strategy was also used in the Siena Reasoning Test. Goldstein (2008); Scherbaum, Hanges, Yusko et al. (2012). The problem with such tests is that answering these kinds of questions relies in part on short-term memory, a much less g-loaded cognitive skill than those captured by other test batteries. Michael McDaniel analyzed the Siena Reasoning Test data (he did not have access to the data necessary to analyze the Fagan test) and found that the “findings are consistent with the inference that the reported lower mean racial differences in the Siena Reasoning Test are due to its lower g saturation relative to other g tests. If this inference is correct, one could also infer that the apparent lower g saturation of the Siena Reasoning Test would be associated with lower validity and larger prediction errors.” McDaniel and Kepes (2014): 339. In a 2018 presentation to the Personnel Testing Council of Metropolitan Washington based on his research conducted for the U.S. Army to evaluate alternative g tests, McDaniel summarized “Ways to build a g test with low mean group differences” as “1. Use easy items in the test. 2. Use items with low g saturation in the test. 3. Reduce the reliability of the test so it measures g less well.” McDaniel (2018): 7.
Suppose you posit a broader role for bias, one that cannot be captured by assessments of language and predictive validity. Call it the “background radiation” theory of racism’s effect on IQ. This position holds that the United States is so steeped in the conditions that produce the B/W difference that it affects every performance measure, not just IQ scores. If this position is true, it is useless to look for evidence of test bias. We have no criterion measure that is independent of this culture and its history. The bias pervades everything.
If you take that position, I can’t argue you out of it with data. None can conceivably exist. But you should understand the implications of that position. The background radiation hypothesis implies that the performance yardsticks in our society are not only biased but so similar in the degree to which they distort the truth (in every type of educational institution from kindergarten through graduate school, at every level of every occupation, for every performance measure) that no differential distortion is picked up by the data. Is this plausible? Everyday experience suggests that the environment confronting blacks in different sectors of American life is not uniformly hostile. Assuming that the background radiation hypothesis is true represents a considerably longer leap of faith than the limited assumption that racism is still a factor in American life.
5. Source: the Census Bureau’s 2018 Annual Social and Economic Supplement of the Current Population Survey, hereafter referred to as CPS-2018. The data were downloaded from cps.ipums.org. The numbers for sex differences in educational attainment refer to persons ages 25–54 after applying the CPS sample weights. The means for highest grade completed in 2018 were 14.5 for women and 14.1 for men. Women have had a higher mean every year from 1997 through 2018. The percentages of persons with college degrees in 2018 were 41.0 percent for women and 35.7 percent for men. Women have had a higher percentage than men every year from 2003 through 2018.
The sources for analyses that control for IQ are the 1979 and 1997 cohorts of the National Longitudinal Survey of Youth, hereafter referred to as NLSY79 and NLSY97. The surveys are sponsored by the Bureau of Labor Statistics of the U.S. Department of Labor. Data were downloaded from the NLS Investigator (nlsinfo.org/investigator/pages/search.jsp). They represent two of the handful of large American datasets that have a full-scale measure of cognitive ability and a long follow-up period, combined with detailed data on education, marital status, fertility, income, and labor market experience. The application of sample weights permits nationally representative estimates.
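For readers unfamiliar with sample weights, the following is a minimal sketch of how a weighted mean and a weighted percentage are computed. Each respondent's weight is the number of people in the population that the respondent is taken to represent. The function names and the three records in the example are made up for illustration; they are not CPS or NLSY data, and they do not use the variable names in those files.

```python
def weighted_mean(values, weights):
    """Mean of values, with each observation counted in proportion to its weight."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def weighted_percentage(flags, weights):
    """Percentage of the weighted population for which the flag is true."""
    return 100.0 * weighted_mean([1.0 if f else 0.0 for f in flags], weights)


if __name__ == "__main__":
    # Three made-up respondents (illustrative only, not survey data).
    highest_grade = [12, 16, 14]
    has_degree = [False, True, False]
    sample_weight = [1000, 3000, 1000]

    print(weighted_mean(highest_grade, sample_weight))      # 14.8
    print(weighted_percentage(has_degree, sample_weight))   # 60.0
```

The unweighted mean of the three made-up grades would be 14.0. The weighted figure is higher because the respondent with 16 years of schooling stands in for three times as many people as either of the others.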