Human Diversity

by Charles Murray

38. Other examples are Jencks (1979); Korenman and Winship (2000); Firkowska-Mankiewicz (2002); and Richards and Sacker (2003).

39. The Project Talent, Aberdeen Birth Cohort, and NCDP databases were part of the table here about IQ and personality factors.

NLSY79. The analysis for educational attainment (n = 8,693) regressed highest grade completed on AFQT scores and the measure of parental SES used in The Bell Curve, which combined measures of parental education, family income, and occupational status. Herrnstein and Murray (1994): Appendix 2. The sample was limited to those 35 or older at the time of their most recent information about highest grade completed. The analysis for income (n = 5,801) regressed the log of earned income in 2005 on the same independent variables, limited to those with reported earned incomes greater than zero. The logged value was set to 6.0 for incomes of $1 to $999. The data reported in the table are standardized regression coefficients.

NLSY97. The analysis for educational attainment (n = 4,078) regressed highest grade completed on AFQT scores and an index of parental SES that combined the resident parents’ educational attainment and family income (occupational data for the parents were not collected for the NLSY97). The sample was limited to those who were interviewed in the 2015 follow-up when 97 percent of them were ages 31–35 and the rest were 30 or 36. The analysis for income (n = 3,025) regressed the log of earned income in 2005 on the same independent variables, limited to those with reported earned incomes greater than zero. The logged value was set to 6.0 for incomes of $1 to $999. The data reported in the table are standardized regression coefficients.
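The income specification described above can be sketched in a few lines. This is a minimal illustration on simulated data, not the NLSY analysis itself: the function names, the simulated effect sizes, and the use of the natural log are all assumptions (the note does not say which log base was used).

```python
import numpy as np

def logged_income(income):
    """Natural log of income, with incomes of $1 to $999 set to 6.0 (assumed base e)."""
    income = np.asarray(income, dtype=float)
    logged = np.log(np.clip(income, 1.0, None))  # clip guards against log(0)
    return np.where(income < 1000.0, 6.0, logged)

def standardized_coefficients(y, predictors):
    """OLS on z-scored variables; the fitted slopes are standardized coefficients."""
    def z(v):
        v = np.asarray(v, dtype=float)
        return (v - v.mean()) / v.std()
    X = np.column_stack([z(p) for p in predictors])
    # No intercept column is needed: every z-scored variable has mean zero.
    beta, *_ = np.linalg.lstsq(X, z(y), rcond=None)
    return beta

# Simulated, hypothetical data standing in for AFQT and parental SES.
rng = np.random.default_rng(1)
n = 3000
afqt = rng.normal(size=n)
ses = 0.4 * afqt + np.sqrt(1 - 0.4**2) * rng.normal(size=n)
income = np.exp(10.5 + 0.4 * afqt + 0.1 * ses + 0.8 * rng.normal(size=n))

betas = standardized_coefficients(logged_income(income), [afqt, ses])
print({name: round(float(b), 3) for name, b in zip(("afqt", "ses"), betas)})
```

Because every variable is z-scored before fitting, each coefficient reads as the change in the outcome, in standard deviations, associated with a one standard deviation change in that predictor, which is what makes the two predictors directly comparable.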

40. The one instance in which childhood SES had more effect on educational attainment than childhood IQ was for the cohort born in 1932 in Scotland, whereas the relative importance of IQ is greatest in the two most recent cohorts, both from the United States, which suggests two obvious explanations: The influence of SES on schooling diminishes over time as educational opportunities for all have expanded; and SES is less important in the United States than in the UK, where the class system has historically been a much bigger deal. But the Aberdeen data are conspicuously inconsistent with those explanations, so skepticism of this interpretation is appropriate.

The one analysis in which childhood SES had more effect on adult occupations or income than childhood IQ was for the Talent database, which found small effects of either IQ or childhood SES on income, but the effect of childhood SES was slightly larger. Both the small size of the effects and the comparatively small role of IQ are probably explained by the follow-up age for the Talent sample: just 11 years after high school graduation, usually meaning 29–30 years old, years when many who will eventually become high earners (e.g., physicians or attorneys) are either still in school or in entry-level positions and many who are blue-collar workers in the skilled trades are at the peak of their earning power.

13: Constraints and Potentials

1. Moffitt, Caspi, and Rutter (2005).

2. Krapohl, Rimfeld, Shakeshaft et al. (2014): Table S7.

3. Krapohl, Rimfeld, Shakeshaft et al. (2014): 4 of 6.

4. Krapohl, Rimfeld, Shakeshaft et al. (2014): 3 of 6. A 2019 study using the Twins Early Development Study (first author was Saskia Selzam) argued that it had found additional evidence of a major role for passive rGE. The authors used a polygenic risk score (described in chapter 14) derived exclusively from genetic variants evaluated in a large sample of unrelated people. These polygenic risk scores had been shown to predict both cognitive ability and non-cognitive traits such as height and BMI. Using a sample of fraternal twins, they asked whether the same polygenic risk scores predicted cognitive and noncognitive traits equally well within families and between families. For the noncognitive traits, the answer was yes. For the cognitive traits, the within-family prediction was significantly lower than the between-family prediction. Then the authors examined the effect on the findings of including SES (assessed on age of mother and parental education) in the analysis. They found that the entire difference in prediction between families and within families was accounted for by SES and concluded that the explanation was passive rGE. Selzam, Ritchie, Pingault et al. (2019).

The difficulty in interpreting this conclusion is the generic one I discussed in chapter 11: SES is partly a function of parental personality traits and cognitive ability. But SES also has a genuinely environmental component. Some degree of passive rGE is surely at work; it’s just hard to know how much. It would be interesting to replicate the analysis for height (one of the traits included in the study) after controlling for the midpoint of parental height to get a sense of how much the predictive validity for a highly heritable trait is reduced.

5. E.g., Horn, Loehlin, and Willerman (1979); Plomin and DeFries (1983).

6. Scarr-Salapatek (1971). Scarr (she later dropped Salapatek) used the following logic: (1) Heritability is a ratio of variance explained by genes divided by total variance; (2) children in disadvantaged families usually have lower mean IQs than children from other families; (3) if those differences are entirely environmental in origin, then something in the environment is depressing the scores of disadvantaged children. If that is the case, then the environment is explaining more of the variance in IQ for disadvantaged children than for other children. If the environment is explaining more, then necessarily genes (the numerator in the heritability ratio) are explaining less, and it necessarily follows that the estimate of heritability for disadvantaged children will be lower than for other children. Scarr found evidence for that hypothesis in a sample of Philadelphia twins.
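Scarr's logic about the heritability ratio can be illustrated with a toy calculation. The variance figures below are hypothetical and chosen only to show the direction of the effect; they are not taken from her Philadelphia sample.

```python
def heritability(genetic_variance, environmental_variance):
    """Heritability as variance explained by genes divided by total variance."""
    return genetic_variance / (genetic_variance + environmental_variance)

# Hypothetical variances: the genetic variance is held constant, while the
# environment explains more of the variance among disadvantaged children.
h2_other = heritability(genetic_variance=60.0, environmental_variance=40.0)
h2_disadvantaged = heritability(genetic_variance=60.0, environmental_variance=90.0)

print(h2_other, h2_disadvantaged)
```

With the numerator unchanged and the denominator larger, the estimate for the disadvantaged group is necessarily lower, which is the step the argument relies on.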

7. Fischbein (1980) found that among Swedish 12-year-old twins, heritability for the children of employers on a verbal test was much higher (.76) than among children of manual workers (.21). Rowe, Jacobson, and Van den Oord (1999) analyzed scores on a measure of verbal IQ for 523 twin pairs and reached strikingly similar results. Using parental education as the moderating variable, heritability for the highly educated families was .72 compared to Fischbein’s .76. Among poorly educated families, heritability was .26 compared to Fischbein’s .21. The role of the shared environment, .23 in both studies, remained small even for children from low-SES families.

8. Turkheimer, Haley, Waldron et al. (2003).

9. Specifically, the SES scores were based on the 100-point system from Myrianthopoulos and French (1968).

10. The values for h2 and c2 are estimates based on Fig. 3 in Turkheimer, Haley, Waldron et al. (2003).

11. One widely used social science method is multivariate regression analysis, described in Appendix 1. A multivariate regression equation has a coefficient associated with each independent variable (recall that “independent” causes “dependent”) that represents the effect of that variable after taking the other independent variables into account. For example, if the dependent variable is years of education and one of the independent variables is family income measured in thousands of dollars, then a coefficient of .003 would mean that an income of $100,000 is associated with an increase in years of education equaling 100 × .003, or .3 years of education. These coefficients are the main effects. To test for an interaction effect in regression analysis, the two variables in question are multiplied by each other. The coefficient associated with that combined variable is the interaction effect.
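As a rough sketch of the procedure just described, the following fits main effects and an interaction term by ordinary least squares. Everything here is simulated and hypothetical: the variable names, the sample size, and the effect sizes are assumptions chosen for illustration, with the income coefficient set to the .003 used in the example above.

```python
import numpy as np

# Worked arithmetic from the example: income is measured in thousands of
# dollars, so $100,000 is 100 units.
effect_of_100k = 100 * 0.003  # about .3 years of education

# Simulated, hypothetical data with known main effects and a known interaction.
rng = np.random.default_rng(0)
n = 5000
income = rng.normal(60, 20, n)   # family income, thousands of dollars
ability = rng.normal(0, 1, n)    # standardized ability score
education = (12 + 0.003 * income + 0.8 * ability
             + 0.01 * income * ability + rng.normal(0, 1, n))

# Design matrix: intercept, the two main-effect variables, and their product.
X = np.column_stack([np.ones(n), income, ability, income * ability])
coef, *_ = np.linalg.lstsq(X, education, rcond=None)
intercept, b_income, b_ability, b_interaction = coef

print(f"main effects: income={b_income:.4f}, ability={b_ability:.3f}")
print(f"interaction:  income x ability={b_interaction:.4f}")
```

The coefficient on the product column is the interaction effect; the coefficients on the two original columns are the main effects.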

A common error when people talk about interaction effects informally is to conflate an interaction effect with an additive effect. Suppose that two independent variables each have a sizable main effect. In a regression analysis, both of those sizable effects are combined. The fact that the interaction term is small or insignificant doesn’t diminish the importance of the additive main effects. Thus family income and genetic endowment for IQ could each have important effects on years of education; failure to find an interaction simply means that their added main effects aren’t significantly augmented by an interaction between the two.

Two other comments about interaction effects: First, big interaction effects are rare. The main effects usually extract most of the juice from the two independent variables under examination. Second, unless the interaction effect is really big or the sample size is really big, it is unlikely that a significant interaction effect will be found. Most of the analyses that include interaction terms are drastically underpowered: reliably estimating an interaction effect requires a sample about 16 times as large as the sample needed to estimate a main effect. Gelman (2018). If you want to get into the statistical subtleties of interaction effects and your math is up to speed, a basic resource is Cox (1984).
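A back-of-the-envelope sketch of where a factor of 16 can come from, under two stated assumptions: a balanced design in which the main effect is a difference between two halves of the sample and the interaction is a difference between two such differences, and an interaction that is half the size of the main effect. The function names and the numbers plugged in are illustrative only.

```python
import math

def se_main_effect(sigma, n):
    """Standard error of a difference between two group means, n/2 per group."""
    return math.sqrt(sigma**2 / (n / 2) + sigma**2 / (n / 2))

def se_interaction(sigma, n):
    """Standard error of a difference of differences across four cells of n/4."""
    return math.sqrt(4 * sigma**2 / (n / 4))

sigma, n = 1.0, 1000
se_ratio = se_interaction(sigma, n) / se_main_effect(sigma, n)  # about 2

# Assumption: the interaction is half as large as the main effect.
effect_ratio = 0.5

# Standard errors shrink with the square root of n, so matching the main
# effect's ratio of estimate to standard error takes this many times the sample.
sample_multiplier = (se_ratio / effect_ratio) ** 2

print(se_ratio, sample_multiplier)
```

If the interaction were assumed to be as large as the main effect, the same arithmetic would give a multiplier of 4 rather than 16, so the figure depends heavily on the assumed size of the interaction.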

12. The phrase “interpretable as replications” is borrowed from Kirkpatrick, McGue, and Iacono (2015), which selected five studies between 2003 and 2015 that qualified: Harden, Turkheimer, and Loehlin (2007); van der Sluis, Willemsen, de Geus et al. (2008); Grant, Kremen, Jacobson et al. (2010); Hanscombe, Trzaskowski, Haworth et al. (2012); and Bates, Lewis, and Weiss (2013). The results from three other studies shown in Fig. 6.02 are the Kirkpatrick study itself; Bates, Hansell, Martin et al. (2016), which postdated the Kirkpatrick study; Tucker-Drob, Rhemtulla, Harden et al. (2011); Rhemtulla and Tucker-Drob (2012); Spengler, Gottschling, Hahn et al. (2018); and the U.S. sample in Tucker-Drob and Bates (2015). The last two were not selected by Kirkpatrick, McGue, and Iacono (2015). I surmise that they were not selected because they dealt with very young samples (tested at ages two and four respectively), when mental tests are much less reliable than with older children.

A meta-analysis of the literature in 2015, Tucker-Drob and Bates (2015), included other studies that were too different in their measures of cognitive ability and/or SES to qualify as “interpretable as replications.” These additional studies and some summary comments are given below:

Asbury, Wachs, and Plomin (2005). This article used a British twins sample, employing a detailed characterization of the shared environment, of which SES was just one component. The others were family chaos, maternal depression, harsh parental discipline, negative parental feelings, maternal medical problems in pregnancy, twin medical risk, instructive parent-child communication, informal parent-child communication, and educational toys. The study had separate cognitive measures for verbal and nonverbal IQ. With regard to SES, the authors (who included Robert Plomin) wrote, “Previous G×E research in the field of cognitive development has focused almost exclusively on SES. In the case of the current research it could be argued that interactions between SES and cognitive abilities do not exist. However, what our data actually suggest is that only very low SES has an effect that, in the case of verbal ability, is to raise group heritability.” (p. 657). In other words, the interaction of genes and SES was in the opposite direction of the one observed in Turkheimer, Haley, Waldron et al. (2003) and other studies. The authors speculated that this might reflect British-U.S. differences in culture and child-rearing practices.

Bartels, van Beijsterveldt, and Boomsma (2009). The primary purpose of this study was to assess the effects of breastfeeding. The raw data were reanalyzed by Tucker-Drob, Rhemtulla, Harden et al. (2011), and failed to show a G×E interaction.

Jacobson and Vasilopoulos (2012).

Soden-Hensler (2012) is an unpublished PhD dissertation. Tucker-Drob and Bates (2015) included it in their meta-analysis and described it as one of the studies that failed to find a G × SES effect.

13. van der Sluis, Willemsen, de Geus et al. (2008); Hanscombe, Trzaskowski, Haworth et al. (2012); Grant, Kremen, Jacobson et al. (2010); Bates, Hansell, Martin et al. (2016).

14. Four of the sources for the table contained graphs plotting c2 against percentiles of SES. In two cases, Rhemtulla and Tucker-Drob (2012) and Tucker-Drob, Rhemtulla, Harden et al. (2011), the metric for the horizontal axis was standard deviations instead of percentiles, and the metric for the vertical axis was variance instead of c2, which I converted to their percentile and c2 equivalents. The numbers used for the table for those studies and for Tucker-Drob and Bates (2015) were retrieved from the figures in those articles, which means they are accurate to within a few percentiles. For Harden, Turkheimer, and Loehlin (2007), the figures are based on parental income (the interaction effect for the parental education representation of SES was not significant). For Rhemtulla and Tucker-Drob (2012), the figures are based on the mathematics score (the interaction effect for the reading score was not significant).

15. Tucker-Drob and Bates (2015).

16. Tucker-Drob and Bates (2015): 9.

17. A major reason to do a meta-analysis is that sampling variability is often the source of inconsistent results across studies. Therefore it is intrinsically dangerous to focus on the studies that found a major effect and discount the others: There’s too much chance of finding what you want to find rather than what’s true. With that in mind, I will mention some interesting aspects of the three studies that found a large substantive effect.

All of them dealt with very young children. The Rhemtulla and Tucker-Drob studies used the same twins dataset, with the Tucker-Drob study measuring cognitive ability at ages 12 and 24 months while the Rhemtulla study measured it at 4 years of age. The other study that found a large substantive effect, the original Turkheimer study, used measures of children at age seven. The neuroscience literature about the rapid development of children’s brains during the first years of life is consistent with the findings of these three studies. Consider the ages in the other studies that showed small interaction effects or no interaction effects. The youngest age group was 11 (one of the cohorts in the Kirkpatrick study), and the rest were adolescents or adults. The broader finding that the heritability of IQ increases with age is also consistent with the failure of studies to find large interaction effects among subjects who are adolescents or older. I should add, however, that there are problems with this inference. The Kirkpatrick study and the meta-analysis by Tucker-Drob and Bates both tested for an age-related trend and found none.

Another issue was suggested by the authors of the Kirkpatrick study, who noted that the samples for all of the attempted replications were overwhelmingly white, while the original Turkheimer study consisted of 54% blacks and 43% whites. “[P]erhaps low SES must be combined with membership in a disadvantaged minority group,” they speculated, “whose place in and experience of American society is unique due to the historical legacy of slavery.” Kirkpatrick, McGue, and Iacono (2015): 209. It’s worth noting in this regard that the sample used by the Rhemtulla and Tucker-Drob studies included 16 percent blacks, a much higher proportion than in any of the other studies.

The Turkheimer sample was also unique in that it included a far higher proportion of severely impoverished families than the other samples. The median years of education for the head of household was less than 11. The median occupation was “service worker.” Twenty-five percent were classified as “laborers.” Twenty-five percent were below the poverty line. If the most powerful interaction between genes and SES occurs at the very bottom of the distribution, the Turkheimer study had the only sample that could identify it.

I offer these as interesting possibilities. What we need at this point is a study with a large enough and varied enough sample. Simulations conducted in the course of the Tucker-Drob meta-analysis indicated that an adequately powered study would require a minimum of 3,300 twin pairs. Comparing the results from the initial Turkheimer study with subsequent results, the Tucker-Drob and Bates meta-analysis concluded: “It therefore appears that the inconsistency of previous U.S. studies to replicate Gene × SES effects on intelligence may have stemmed from low power associated with overly optimistic expectations regarding the magnitude of the true interaction effect.” Tucker-Drob and Bates (2015): 146.

18. Duncan and Magnuson (2013): 114.

19. Duncan and Magnuson (2013): 114.

20. Duncan and Magnuson (2013): 120.

21. U.S. Department of Health and Human Services (2010).

22. Duncan and Magnuson (2013): 117.

23. Duncan and Magnuson (2013): 119.

24. The paragraph continues:

In general, a finding of meaningful long-term outcomes of an early childhood intervention is more likely when the program is old, or small, or a multi-year intervention, and evaluated with something other than a well-implemented RCT [random controlled trial]. In contrast, as the program being evaluated becomes closer to universal pre-K for four-year-olds and the evaluation design is an RCT, the outcomes beyond the pre-K year diminish to nothing. I conclude that the best available evidence raises serious doubts that a large public investment in the expansion of pre-K for four-year-olds will have the long-term effects that advocates tout. (Whitehurst (2013): 8–9).

25. Phillips, Lipsey, Dodge et al. (2017): 12.

26. Branden (1969).

27. Mecca, Smelser, and Vasconcellos (1989).

28. Baumeister, Campbell, Krueger et al. (2003). See also Twenge (2006): chapter 2.

29. Steele and Aronson (1995).

30. Sackett, Hardison, and Cullen (2004).

31. Sackett, Hardison, and Cullen (2004).

32. The original article by Steele and Aronson statistically adjusted the results from the experimental test (using items from the Graduate Record Examination) based on the participants’ SAT scores when they had entered college. This often happens in studies of stereotype threat. It intuitively seems like a reasonable thing to do. In reality, it introduces a serious statistical confound. Readers who already know some statistics can get the details in Miller and Chapman (2001). Stoet and Geary (2012) explain it this way: “An important assumption of a covariate analysis is that the groups do not differ on the covariate. But that group difference is exactly what stereotype threat theory tries to explain! This is an irreconcilable difference between the theory and the statistical assumptions underlying covariate analysis.” (p. 96). The practice of adjusting scores has diminished in recent years as word about the confound has spread, but it creates problems in interpreting much of the literature. For example, the Stoet and Geary meta-analysis of gender-based stereotype threat literature as of 2012 reported two effect sizes. The effect size for the studies that adjusted for prior test scores was a sizable –0.61. The effect size for studies that didn’t adjust for prior test scores was just –0.17. Stoet and Geary (2012): 97.
