Human Diversity
In 2014, Turkheimer pulled together strands he had been writing about for years into a formal statement of what amounts to a fourth law of behavior genetics to add to the first three I introduced in chapter 11. He calls it the “Phenotypic Null Hypothesis for the Genetics of Personality.” It goes like this: “All traits are heritable, and the multivariate structure of the biometric components of behavior does not differ from the phenotypic structure.”22 He subsequently puts the central idea more simply: A phenotypic trait can be heritable without having a genetic mechanism.
To introduce what he means, Turkheimer draws a contrast between Huntington’s disease and divorce. If we observe a person exhibiting the symptoms of Huntington’s disease, we don’t go looking for sociological explanations. Researchers have established an explanation at the genetic level that is theoretically sound and has been verified by test. Causation is known.
Suppose instead we observe a person who is getting a divorce. Marital status is highly heritable—72 percent in one large-sample twin study.23 The heritability of divorce specifically has been estimated at around 50 percent.24 Because divorce is heritable, we can be sure that a GWAS will identify a large number of SNPs that are significantly associated with divorce. But what have we really learned?
Suppose, for example, that some of the SNPs are related to the personality trait “irritability.” Isn’t that a plausible causal link to divorce? It could be, for some fraction of divorces. But we can’t be sure of even that. Pervasive pleiotropy probably means that the SNPs related to irritability are also related to a number of other traits that are just as plausibly a cause of divorce—or, conversely, might be related to traits that would more plausibly be related to resistance to divorce. Omnigenetics and pleiotropy both work to create a causal map so sprawling and indeterminate that it is reasonable to conclude that GWAS has taught us nothing new about the causes of divorce and that finding more SNPs in more studies won’t teach us anything important. “The heritability of marriage is a by-product of the universal, nonspecific, genetic pull on everything, not an indication that divorce is a biological process awaiting genetic analysis,” Turkheimer writes. “Marriage and divorce are heritable, but they do not have a specific genetic etiology.”25
Turkheimer is not alone in making this point. Geneticists Marcus Feldman and Sohini Ramachandran, who share his position, put it this way:
We must start from recognition that all complex human traits result from a combination of causes. If these causes interact, it is impossible to assign quantitative values to the fraction of a trait due to each, just as we cannot say how much of the area of a rectangle is due, separately, to each of its two dimensions. Thus, in the analyses of complex human phenotypes, such as those described above, we cannot actually find “the relative importance of genes and environment in the determination of phenotype.”26
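The rectangle analogy can be made concrete with a few lines of arithmetic. The sketch below is only an illustration, with made-up dimensions and function names. It contrasts a multiplicative (interacting) model, where the effect of one cause depends on the level of the other, with an additive model, where each cause has a fixed, separable contribution.

```python
def area(width, height):
    # Interacting causes: the two dimensions multiply.
    return width * height

def additive(width, height):
    # Non-interacting causes: the two contributions simply add.
    return width + height

# Widening a rectangle by one unit: the gain depends entirely on the height,
# so no fixed share of the area can be credited to width alone.
gain_when_short = area(3, 2) - area(2, 2)
gain_when_tall = area(3, 10) - area(2, 10)

# Under the additive model the same change has the same effect
# regardless of the other variable.
add_gain_short = additive(3, 2) - additive(2, 2)
add_gain_tall = additive(3, 10) - additive(2, 10)

print(gain_when_short, gain_when_tall)   # differ from each other
print(add_gain_short, add_gain_tall)     # identical
```

Only in the additive case is it meaningful to ask how much of the outcome is due to each cause separately.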
It is important to emphasize that Feldman, Turkheimer, and like-minded colleagues are not merely repeating Graham Coop’s cautions about how many complications remain unresolved. They aren’t just saying that it’s early days yet and that we shouldn’t get ahead of the data. They are saying that when it comes to complex traits, the GWA enterprise is futile. Turkheimer again: “Causal explanations of complex differences among humans are therefore not going to be found in individual genes or environments any more than explanations of plate tectonics can be found in the chemical composition of individual rocks.”27
Predictions
On some purely technical issues, the Plomin and Turkheimer schools are not in conflict. Plomin does not argue that polygenic scores are causal as Turkheimer defines it. On the contrary, he acknowledges the disconnect: “The correlation between a polygenic score and a psychological trait does not tell us about the brain, behavioral or environmental pathways by which the polygenic score affects the trait.”28 For his part, Turkheimer does not dispute the existence of the correlations between polygenic scores and phenotypic traits that Plomin describes.
Yet these two schools nonetheless represent radically different understandings of where genomics and neuroscience are going to take us. The great debate for which they are exemplars is going to continue, informed by new developments in analytic methods and results from the huge new genomic databases that are coming online. I have speculative opinions about how the debate will go that I will reserve for the final chapter. Here, I confine myself to some consequences that I think are close to inevitable.
I should begin by stating my own assessment of the great debate, because it undoubtedly affects my predictions: In my field, applied social science, predictive validity trumps causal pathways. The Turkheimer position about our ignorance of causal pathways is certainly correct now and may be correct for decades to come. But applied social science has never been about causal pathways (until now, it’s never been an option) and perhaps never will be. It’s about explaining enough variance to make useful probabilistic statements.
Regarding the current limitations on predictive validity and the limited ways in which the genomic analyses add anything to what we already know from twin studies, I again think the Turkheimer position is correct about where we stand now. If you want to know a six-year-old’s cognitive ability, an IQ score is still much more accurate than a polygenic score. If you want to know the heritability of a trait, polygenic scores still don’t tell us much that we don’t already know from twin studies. But we’re talking about a field that sees methodological advances virtually every month. I think the application of genomic data to social science questions is roughly where aviation was in 1908. Eric Turkheimer thinks the Wright Flyer design has unfixable performance limits (and it does). Robert Plomin foresees the DC-3 (and it’s coming).
Polygenic Scores Will Be Useful No Matter What and Will Therefore Be Used
By the end of the 2020s, it will be widely accepted that quantitative studies of social behavior that don’t use polygenic scores usually aren’t worth reading. More formally, it will be widely accepted that the predictive validity of polygenic scores gives us useful information about causes even though we still don’t understand the causal pathways. It’s not an unusual situation in science, including the hard sciences. Look at the discovery of laws in physics through the nineteenth century that were validated solely by their predictive validity.
I will use a specific example to illustrate the situation facing applied social science. Suppose it’s 2030 and researchers are exploring causes of juvenile crime. In addition to the standard predictors as of 2020 (e.g., parental SES, education, IQ), researchers have access to polygenic scores for various aspects of criminality. They analyze how the phenotypic measures interact with the polygenic measures as predictors of criminal behavior. In light of the Turkheimer school’s objections, can the researchers be sure that the results for the polygenic scores are legitimately interpreted as causal? No. But in an Occam’s-razor sense, the results of the analyses will make alternative hypotheses more or less plausible and, as importantly, generate ideas for the next round of analyses that will incrementally clarify what’s going on. In 2030, when large databases with genomic information are easily available, I predict it will be akin to professional malpractice to conduct an analysis of social behavior that does not include genomic information. In any case, few quantitative social scientists are going to write such analyses because they won’t get past peer review. The question, “Why didn’t you take genetics into account?” will be universal and will have no good answer.
Broad Swaths of Social Science Will Be Affected
As I write, none of the social sciences have come to terms with genetics. My second prediction is that by 2030 the holdouts will be confined to isolated pockets. The impact of the genomic revolution will have importantly affected all of the traditional social science disciplines.
“Affect” can mean several things. The most important will be the role of genomics in creating novel research strategies that wouldn’t have occurred to a social scientist in the pre-genomics era. In the study of genetic effects, twin studies are confining. They require large, hard-to-assemble, expensive samples of twins. The Falconer equations are a blunt tool that enables us to apportion roles to genes, shared environment, and everything else, thereby answering the “how much” question. The new techniques will open up new ways to explore the “how” questions. I’ve focused on polygenic scores, but a variety of analytic tools are being developed, such as Genome-Wide Complex Trait Analysis (GCTA).29 Just as the advent of the university computer did in the 1960s and 1970s, the advent of cheap genomic information will generate new classes of studies that cannot be anticipated. Comparing the eventual power and flexibility of genomic analyses with the ACE model is akin to comparing the power and flexibility of multiple regression analysis with the analysis of a 2×2 contingency table.
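For readers who want to see how blunt the tool is, the standard textbook form of the Falconer equations fits in a few lines. The sketch below assumes the usual ACE simplifications (additive genetic effects only, equal environments for MZ and DZ pairs, no assortative mating). The twin correlations passed in are hypothetical values chosen for illustration, not results from any study.

```python
def falconer(r_mz: float, r_dz: float) -> dict:
    """Falconer's approximations from MZ and DZ twin correlations."""
    a2 = 2 * (r_mz - r_dz)  # A: additive genetic share (heritability)
    c2 = 2 * r_dz - r_mz    # C: shared environment
    e2 = 1 - r_mz           # E: everything else (nonshared environment, error)
    return {"A": a2, "C": c2, "E": e2}

# Hypothetical correlations, for illustration only.
estimates = falconer(r_mz=0.80, r_dz=0.50)
for component, share in estimates.items():
    print(component, round(share, 2))
```

The three shares sum to one by construction, which is what makes this an answer to the “how much” question and nothing more. Two correlations go in and three proportions come out, with no information about mechanism.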
With regard to the existing classes of studies, cheap genomic information will also broadly affect studies that analyze a personality trait, ability, or social behavior as it varies by sex, ethnicity, or class. The degree to which such studies are woven into research agendas varies by discipline.
Psychology. The genomics revolution will affect just about everything in psychology that involves the analysis of quantitative data. Psychology is about understanding human personality, emotions, cognitive abilities, and behavior. All of those topics include genetic sources. Plomin’s description of the possibilities that I summarized earlier conveys the breadth of the potential effects on both clinical practice and research.
Anthropology. Two of anthropology’s subfields, archaeology and physical anthropology, deal in topics that can obviously be informed by ancient DNA, and there’s no reason to think scholars in those fields won’t take advantage of it. The other two subfields, cultural anthropology and linguistic anthropology, should be as dramatically affected by genomic information as psychology will be, but they are now a battleground between scholars who see their discipline as a science and those who see it as a hybrid of investigation and social justice advocacy.[30] I assume that genomic information will be incorporated to some degree into these latter two subfields, but it is not clear to what extent.
Sociology. Some corners of sociology involve empirical topics that won’t be affected, but they are the exception. To give you an idea, consider the 49 articles that were published in America’s most prestigious sociological journal, the American Sociological Review, in 2018 and the first issue of 2019. Of the 39 articles that presented either survey data or quantitative experimental results, 33 were on topics for which polygenic scores would be directly relevant. In almost half (18 of the 39), the major topic of the article directly involved sex, ethnicity, or class.[31]
Economics and political science. The role of psychological factors in economics goes back to Adam Smith’s Theory of Moral Sentiments. The work of Daniel Kahneman, Amos Tversky, and Paul Slovic on decision making under conditions of uncertainty and, more recently, the work of Cass Sunstein and Richard Thaler on “nudge” theory, are both rich fields of study that will be informed by genomic data.32 They are only part of the growing field of behavioral economics. Similarly, questions about how humans act as political agents are at the core of political science. Genomic information is just as relevant to voting decisions as it is to economic decisions. The finding from twin studies that political and ideological views are substantially heritable opens up another set of possibilities.
Social policy. Perhaps the most visible impact of the genomics revolution will be found in public policy analysis. This prediction obviously includes almost any issue involving education, whether pre-K, K–12, or higher education, but it also includes welfare policy, criminality and criminal justice, foster care and adoption, marriage and family, poverty and unemployment—you name it. If it’s about social policy, it’s almost certainly about topics that genomic data will inform.
We already have one specific example as I write. An international team led by Kathryn Paige Harden and Benjamin Domingue used polygenic scores as a “molecular tracer” to explore how the flow of students through the math pipeline in secondary schools varied in socioeconomically advantaged and disadvantaged schools. Among other things, the analysis revealed that advantaged schools did a better job than disadvantaged schools of getting students with high polygenic scores into advanced math classes and of buffering students with low polygenic scores from dropping out of math. It also revealed that many students with exceptional polygenic scores were unlikely to take the most advanced math classes.33 If these findings were to be replicated and elaborated, they would have direct implications for better education policy. It’s just the beginning.
Some Basics About the Role of the Environment Will Be Better Understood Soon
It will be a long time before the details are fully understood, but the introduction of genomic data will answer some of the most basic questions about the respective roles of genes and environment quickly, for two reasons.
First, genomic data can answer questions about genetic nurture (discussed in chapter 13) that twin studies cannot. In twin studies, the shared environment is the same for both twins, which raises difficult technical problems when there is no variation around the family mean (for example, as in the case of divorce, which is by definition completely shared by both MZ and DZ twins).34 Analyses using polygenic scores or GCTA are not constrained to twins and thereby escape that problem.
The broader advantage of genomic analyses in this regard is that the complexities of genetic nurture can be unraveled. “Although twin studies have reported for decades that most environments are nearly as heritable as behaviors, this work has been limited to twin-specific environments,” write Maciej Trzaskowski and Robert Plomin. “GCTA opens up the possibility of investigating genetic influence on family-, neighborhood-, or even country-wide environmental measures that cannot be studied using the twin design because they are shared in common by members of a twin pair.”35 The same is true of analyses using polygenic scores.
Second, genomic analyses using polygenic scores give us a usable baseline measure of genetic potential. As matters stand, every measure of genetic potential that we use, whether from cognitive tests or personality inventories, is contaminated by potential environmental effects, and the contamination is rightly feared to be worst for people who have come from the most disadvantaged environments. Correlations between polygenic scores and phenotypes cannot be explained by backward causation, and that alone is enough to give us important leverage, despite all the complications.36
Eric Turkheimer has used an analogy that illustrates what I mean, comparing polygenic scores to a pile of raw building materials. Let’s say that you have many such piles, each of which will be used to construct a building. If you carefully examine the components of different piles, you can determine similarities among them—the buildings’ starting places. “That similarity in starting place winds up being correlated with how similar the eventual buildings are,” Turkheimer writes. “So pile-similarity is correlated with similarity in how the buildings are used, or what color they are or how big they are, or whatever. These correlations aren’t enormous, but they are striking, often in the range of .4–.6. What’s more, it turns out that occasionally, there are identical piles of materials, and although these identical piles don’t produce identical buildings, the buildings they produce are damn similar, often in the range of .7–.9. This is the heritability of building type.”37
Turkheimer’s point is that an individual’s genetic potential can lead to a widely dispersed range of phenotypes, which is unquestionably true. My point is that the piles are there at the beginning, constrain the range of possibilities, and are causal in just one direction. By the same token, a polygenic score for IQ or any other trait is causally antecedent, and that makes an enormous difference in the research questions we can answer confidently.
To see how dramatically this will change matters, recall from chapter 12 the vexed question of a G×E interaction between childhood SES and the heritability of IQ. The reason that vexed question is so important is that low heritability for disadvantaged children at young ages could mean an opening for interventions to have major effects. One reason the results have been so equivocal is that measures of IQ before the age of six are so unreliable. A reasonably good polygenic score for IQ fixes that.
Suppose that polygenic scores of children from disadvantaged backgrounds show that their IQ scores as adolescents average 10 points lower than their polygenic scores would have led us to expect. Confident new knowledge of that kind will energize the search for effective interventions in ways that we can scarcely imagine. Conversely, suppose it is found that the relationship of polygenic scores to phenotypic IQ scores in adolescence is about the same regardless of the childhood environment. I realize that many people dread such an outcome. In fact, that too will provide an incentive: to redirect our attention to fostering human flourishing for people with a wide range of ineradicable inequalities in gifts—a topic I take up in the concluding chapter. The most likely scenario is that the results will be less dramatic in either direction but will nonetheless teach us much about untapped potential.
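One simple way such a comparison could be set up is sketched below. It is an illustration under strong assumptions, not a description of any published method. It assumes a linear relationship between a standardized polygenic score and adolescent IQ, a reference sample in which that relationship is estimated, and a score that predicts equally well in both groups. All numbers and names are hypothetical and constructed so that the second group sits 10 points below expectation.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical reference sample: standardized polygenic score, adolescent IQ.
ref_pgs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ref_iq = [88.0, 95.0, 100.0, 105.0, 112.0]
slope, intercept = fit_line(ref_pgs, ref_iq)

# Hypothetical disadvantaged sample, built to fall 10 points below the line.
dis_pgs = [-1.0, 0.0, 1.0]
dis_iq = [84.2, 90.0, 95.8]

# Gap between observed IQ and the IQ the polygenic score would predict.
gaps = [iq - (intercept + slope * pgs) for pgs, iq in zip(dis_pgs, dis_iq)]
mean_gap = mean(gaps)
print(round(mean_gap, 1))
```

A mean gap near zero would correspond to the second scenario in the text, in which the relationship of polygenic scores to phenotypic IQ is about the same regardless of childhood environment.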
The example generalizes to a wide variety of topics in which the underlying question is the extent to which socioeconomic or cultural disadvantage has affected the realization of a person’s potential. I will not spin out all the collateral analyses that could be done or describe how the analytic complications could be dealt with. I do not expect that such analyses will be free of controversy. Rather, I am asserting that many such analyses are technically feasible, will be conducted within the relatively near future, and will offer powerful tests of questions that have been argued for decades.