If they are causal, causality could work in one of two ways, sometimes called “vertical” and “horizontal” pleiotropy. An example of vertical pleiotropy is a variant that increases LDL cholesterol (bad cholesterol) and also shows an association with the incidence of heart attack—the first causal relationship is direct, the second is downstream from the first.9 Horizontal pleiotropy occurs when a variant has direct causal effects on traits that are apparently not causally related to each other—for example, when a variant seems to have an effect on both LDL cholesterol levels and schizophrenia. A 2018 analysis limited to horizontal pleiotropy concluded that “horizontal pleiotropy is pervasive and widely distributed across the genome” and that “there are thousands of loci that exhibit extreme levels of horizontal pleiotropy.”10
Combining these and other discoveries, the task of translating the raw material I described in chapter 9 into statements about causation is daunting. Geneticist Graham Coop devised a vivid thought experiment that enumerates the difficulties.
Coop asks us to imagine that a genome-wide analysis using the UK Biobank has revealed the British to have more alleles that are associated with tea consumption than the French have. He imagines a protagonist named Bob who concludes that the difference between the French and the British in their preference for tea is in part genetic. Bob is judicious. “Bob would assure us that these alleles are polymorphic in both countries, and that both environment and culture play a role. He would further reassure us that there’ll be an overlapping distribution of tea drinking preferences in both countries, so he’s not saying that all British people drink more tea for genetic reasons. He’ll tell us he’s simply interested in showing that the average difference in tea consumption is partly genetic.”11
Coop then turns to the difficulties that Bob has in drawing even that modest causal inference. He begins with a core point: Genome-wide analyses “do not point to specific alleles FOR tea preferences, only to alleles that happen to be associated with tea preference in the current set of environments experienced by people in the UK Biobank.” Coop does not argue that nothing causally informative can come from GWAS. Some of the tea-drinking SNPs may be enriched near olfactory receptors. Some may be associated with caffeine sensitivity. These are interesting from a causal point of view. But daunting problems stand between these isolated findings and the conclusion that the differential preference for tea among British and French is partly genetic in origin.
First, there are G×E (gene × environment) interactions to contend with. Maybe people who care about their weight are drawn to tea instead of soft drinks because tea has fewer calories. Suppose it is found that alleles correlated with body mass index are also correlated with tea preference. That won’t necessarily permit causal inferences at a national level. Perhaps, for example, what counts is not absolute BMI but one’s relative BMI within a country, and the distributions of BMI in the UK and France are different.
A second problem is technical. Sometimes the SNP that shows up in a genome-wide analysis is the one that actually does the work. However, it is often a “tag” SNP that is physically near the functional SNP in the genome but doesn’t actually do the work.
In comparing populations, this isn’t a big problem if the correlations between the functional SNP and the tag SNP are the same in two populations. But recall from chapter 9 the problems of population stratification. Coop describes how they might affect the tea-drinking analysis. Pretend that British and French ancestral populations have been geographically separated for a long time. The correlation between functional and tag SNPs in one population is likely to have become different from the comparable correlation in the other population if only because of genetic drift and recombination. Let’s say that the allele frequencies of the functional SNP in the British and the French are the same as they’ve always been, but the correlation between the tag SNP and the functional SNP in the British is .90 while the comparable correlation among the French has drifted down to .70. If, unbeknownst to us, our comparison of the two countries is based on the tag SNP, we will wrongly fail to give a bump to French tea-drinking preferences as often as we would if we were working with the functional SNP. Put technically, the predictive validity of the analysis will be lower for the French than it is for the British.
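The arithmetic behind that loss of predictive validity can be sketched in a few lines. This is only an illustration under a simplifying assumption that is mine, not Coop’s: the tag SNP is related to the trait solely through the functional SNP, so the tag’s correlation with the trait is the product of the two correlations. The .90 and .70 figures come from the example above; the function name and the 0.10 correlation between the functional SNP and tea drinking are hypothetical.

```python
def tag_snp_signal(r_functional_trait, r_tag_functional):
    """Correlation and variance explained when we observe only the tag SNP.

    Assumes the tag SNP is linked to the trait solely through the
    functional SNP, so the two correlations multiply.
    """
    r_tag_trait = r_tag_functional * r_functional_trait
    return r_tag_trait, r_tag_trait ** 2


# Hypothetical correlation between the functional SNP and tea drinking.
R_FUNCTIONAL_TRAIT = 0.10

results = {}
for country, r_tag in (("British", 0.90), ("French", 0.70)):
    r, r2 = tag_snp_signal(R_FUNCTIONAL_TRAIT, r_tag)
    # Share of the functional SNP's explained variance that the tag retains.
    retained = r2 / R_FUNCTIONAL_TRAIT ** 2
    results[country] = (r, r2, retained)
    print(f"{country}: r = {r:.3f}, variance explained = {r2:.4f}, "
          f"retained share = {retained:.2f}")
```

Under these assumptions the tag SNP retains about 81 percent of the functional SNP’s explained variance among the British but only about 49 percent among the French, even though the functional SNP works identically in both populations.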
A variant on this problem is assortative mating. Suppose that people who are heavy tea drinkers tend to mate with tall people. Over a few generations, height-increasing alleles will be statistically associated with tea drinking even if there is no causal link. This decreases the predictive validity of that population’s genetic score for tea drinking relative to a population that does not falsely include height-increasing alleles in its genetic score.
Coop also discusses another topic that I raised in chapter 9: A great deal of human variation is concentrated in rare variants that the ordinary GWA won’t pick up, and these rare variants are commonly private to a single ancestral population. Strong conclusions about between-population comparisons will have to wait until we have far more information about the effects of rare variants in different populations than we have now. Coop concludes:
Undoubtedly the coming decades of human genomics will see breakthroughs in the identification of functional loci, the size of GWAS performed world-wide, and in the statistical methodologies used to understand trait variation. There is also no doubt that we will come to understand much more about human variation. However, our ability to perform GWAS to identify loci underlying variation in traits among individuals vastly outstrips our ability to understand the causal mechanisms underlying these differences. In many cases, genetic contributions may not be separable from environmental and cultural differences.12
The tea-drinking example illustrates just how thoroughly the old jigsaw-puzzle metaphor has been blown up. The process of mapping causal chains from genetic variation to phenotypic trait is immensely more complicated than that.
The Great Debate
Immensely more complicated, yes. But is it impossibly complicated? Seen from another perspective, the progress to date has been stunning. Polygenic scores didn’t even exist less than a decade ago. As I write, they already explain significant proportions of the variance in many traits, and progress is rapid. Consider educational attainment, a rough proxy measure for IQ, as an example. In just the five years from 2014 through 2018, the percentage of the variance that could be explained from genetic material alone went from zero to 15 percent.13 For some, the appropriate reaction is “Wow!” For others, 15 percent is not much, and the appropriate reaction is “So what?”
Two leading behavior geneticists whom you have already met have staked out opposite positions: Robert Plomin and Eric Turkheimer. I will sometimes subsequently refer to “the Plomin school” and “the Turkheimer school.” Other scholars have published on these issues, but I think it’s fair to say that Plomin and Turkheimer have published earlier and more prolifically on the positions they represent than anyone else.
They are in many ways a matched pair. Plomin and Turkheimer both obtained their PhDs in psychology at the University of Texas at Austin and studied under many of the same luminaries who were teaching there in the 1970s and 1980s. Both have published seminal articles using twin studies. Both have won prestigious awards. But when it comes to nature, nurture, and complex phenotypic traits, they might as well be on separate planets.
Robert Plomin and Polygenic Scores
“What would you think if you heard about a new fortune-telling device that is touted to predict psychological traits like depression, schizophrenia and school achievement?”14 That’s the opening sentence of Blueprint: How DNA Makes Us Who We Are, which Plomin published in 2018. He is referring to the advent of the polygenic score.
Polygenic scores are the most exciting and also the most controversial use of GWA data. They work like many other indexes (quarterback performance ratings, fielding averages in baseball, economic indexes predicting GDP growth, and IQ scores) that represent the aggregated score on several indicators. Specifically, a polygenic score is the sum of the number of copies of the alleles that promote or intensify a given trait in an individual. In Blueprint, Robert Plomin offered a table of 10 hypothetical SNPs associated with a given trait to illustrate how the calculation works:
THE RAW MATERIAL FOR CALCULATING A POLYGENIC SCORE

| SNP | Target allele | Allele 1 | Allele 2 | Genotypic score | Correlation with trait | Weighted genotypic score |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | T | A | T | 1 | 0.005 | 0.005 |
| 2 | C | G | G | 0 | 0.004 | 0.000 |
| 3 | A | A | A | 2 | 0.003 | 0.006 |
| 4 | G | C | G | 1 | 0.003 | 0.003 |
| 5 | G | C | C | 0 | 0.003 | 0.000 |
| 6 | T | A | T | 1 | 0.002 | 0.002 |
| 7 | C | C | G | 1 | 0.002 | 0.002 |
| 8 | A | A | A | 2 | 0.002 | 0.004 |
| 9 | A | T | T | 0 | 0.001 | 0.000 |
| 10 | C | C | G | 1 | 0.001 | 0.001 |
| Polygenic score | | | | 9 | | 0.023 |

Source: Adapted from Plomin (2018): Table 12.1.
Suppose you are a person whose genome has been sequenced and the target alleles for the ten SNPs in the table are associated with an increase in height. You want to know your polygenic score for height. For SNP 1, you have one copy of the target allele, so you enter 1 in the column labeled “Genotypic score.” For SNP 2, neither copy of your two alleles is the target allele, so you enter 0. For SNP 3, both copies are the target allele, so you enter 2. And so on. All told, you have 9 height-increasing alleles out of a possible 20. That’s the simple version of a polygenic score. The more sophisticated version is to multiply your score in the “Genotypic score” column by a weight. Plomin uses the correlation of the SNP with the trait (regression weights are also commonly used). Thus your “Weighted genotypic score” for SNP 1 is .005, greater than the weighted score for SNP 8, even though you have only one copy of the target allele in SNP 1 versus two copies for SNP 8. Add up all the weighted scores, and the weighted polygenic score is 0.023.
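For readers who find code easier to follow than prose, the calculation just described can be sketched as follows. The allele values and correlations are the ten hypothetical SNPs from Plomin’s table; the function and variable names are my own, and this is a minimal illustration of the arithmetic, not the software that researchers actually use.

```python
# Each entry: (target allele, allele 1, allele 2, correlation with trait),
# taken from the ten hypothetical SNPs in the table.
SNPS = [
    ("T", "A", "T", 0.005),
    ("C", "G", "G", 0.004),
    ("A", "A", "A", 0.003),
    ("G", "C", "G", 0.003),
    ("G", "C", "C", 0.003),
    ("T", "A", "T", 0.002),
    ("C", "C", "G", 0.002),
    ("A", "A", "A", 0.002),
    ("A", "T", "T", 0.001),
    ("C", "C", "G", 0.001),
]


def genotypic_score(target, allele_1, allele_2):
    """Count how many of the two alleles (0, 1, or 2) match the target."""
    return int(allele_1 == target) + int(allele_2 == target)


def polygenic_scores(snps):
    """Return (unweighted, weighted) polygenic scores for one person."""
    unweighted = 0
    weighted = 0.0
    for target, allele_1, allele_2, weight in snps:
        count = genotypic_score(target, allele_1, allele_2)
        unweighted += count          # simple version: count target alleles
        weighted += count * weight   # weighted version: count times weight
    return unweighted, weighted


unweighted, weighted = polygenic_scores(SNPS)
print(unweighted, round(weighted, 3))  # 9 0.023
```

The weights here are correlations because that is what Plomin’s table uses; substituting regression weights would change only the numbers in the last column, not the logic.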
As you can see, neither the unweighted nor the weighted polygenic score has a natural interpretation. The polygenic score can be interpreted only relative to a population. Fortunately, polygenic scores are normally distributed. Eventually (this achievement is probably some years down the road) we can hope for polygenic scores with means and standard deviations that can be interpreted in the same way that they are interpreted for IQ scores (which also have no natural interpretation in their raw form).
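Interpreting a score relative to a population amounts to standardizing it. The sketch below shows the idea under assumptions of my own: the five reference scores are invented for illustration, the population standard deviation is used, and the IQ-style metric (mean 100, standard deviation 15) is borrowed only as a familiar scale. Nothing here is an established convention for polygenic scores.

```python
from statistics import mean, pstdev


def standardize(raw_score, reference_scores, scale_mean=100.0, scale_sd=15.0):
    """Express a raw polygenic score relative to a reference population.

    Returns the z-score and the same value on an IQ-style scale.
    """
    mu = mean(reference_scores)
    sigma = pstdev(reference_scores)  # population SD of the reference group
    z = (raw_score - mu) / sigma
    return z, scale_mean + scale_sd * z


# Hypothetical weighted scores for a (tiny) reference population.
reference = [0.015, 0.019, 0.021, 0.023, 0.027]

z, scaled = standardize(0.023, reference)
print(round(z, 2), round(scaled, 1))  # 0.5 107.5
```

In this toy population, the weighted score of 0.023 from the table sits half a standard deviation above the mean. Against a different reference population the same raw score could land somewhere else entirely, which is the point of the paragraph above.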
Polygenic scores are not limited to SNPs that meet the stringent requirement for genome-wide statistical significance. Plomin points out that the goal is the best composite score. “The new approach to polygenic scores is to keep adding SNPs as long as they add to the predictive power of the polygenic score in independent samples.… Some false positives will be included in the polygenic score but that is acceptable as long as the signal increases relative to the noise, in the sense that the polygenic score predicts more variance.”15
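One way to read Plomin’s rule is as a simple greedy procedure: take the SNPs in order of the strength of their association in the discovery sample, and keep each one only if it raises the variance explained in an independent sample. The sketch below implements that reading. It is my own simplification, not the method used in any particular study, and the six-person “independent sample” is a toy constructed so that the first SNP drives the trait and the second is a false positive.

```python
def pearson_r(xs, ys):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def select_snps(weights, genotypes, trait):
    """Greedily keep SNPs that raise variance explained in a hold-out sample.

    weights   -- per-SNP weights estimated in a separate discovery sample
    genotypes -- hold-out sample: one list of 0/1/2 allele counts per person
    trait     -- hold-out sample: one trait value per person
    """
    # Strongest discovery-sample associations are tried first.
    order = sorted(range(len(weights)), key=lambda j: abs(weights[j]),
                   reverse=True)
    kept, best_r2 = [], 0.0
    scores = [0.0] * len(trait)
    for j in order:
        trial = [s + weights[j] * person[j]
                 for s, person in zip(scores, genotypes)]
        r = pearson_r(trial, trait)
        r2 = r * r if r > 0 else 0.0  # only count prediction in the right direction
        if r2 > best_r2:              # keep the SNP only if it adds signal
            kept.append(j)
            scores, best_r2 = trial, r2
    return kept, best_r2


# Toy hold-out sample (hypothetical): SNP 0 determines the trait, SNP 1 is noise.
weights = [0.5, 0.1]
genotypes = [[0, 1], [1, 0], [2, 2], [1, 1], [0, 2], [2, 0]]
trait = [0, 1, 2, 1, 0, 2]

kept, best_r2 = select_snps(weights, genotypes, trait)
print(kept, best_r2)  # [0] 1.0
```

The false positive is dropped here because it lowers the hold-out fit. In real data the tolerance Plomin describes works the other way as well: a SNP that fails the genome-wide significance test is retained if it improves prediction in the independent sample.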
Plomin sees polygenic scores as a game changer for three reasons:
Predictions from polygenic scores to psychological traits are causal in just one direction (the trait cannot be a cause of the score).
Polygenic scores can predict from birth.
Polygenic scores can predict differences between family members, something that twin studies cannot do.
Polygenic scores have a few other advantages as well. Unlike psychometric measures, which yield somewhat different results when a person is tested more than once, polygenic scores from carefully analyzed DNA samples have 100 percent test-retest reliability. They cannot be influenced by self-esteem, stereotype threat, growth mindset, coaching, or whether the subject got a good night’s sleep before giving the DNA sample.
Plomin expects polygenic scores to transform both clinical psychology and psychology research.16 With regard to clinical psychology, he foresees five such changes:
Polygenic scores will be able to identify the genetic risk that an individual faces for a given disorder before the problem has developed. Psychologists will no longer be confined to observing symptoms and diagnosing problems after they manifest themselves.
Clinical psychology will move away from diagnoses and toward dimensions. One of the revelations of recent research is that polygenic scores are normally distributed, thereby demonstrating that genetic risk for psychological problems is continuous. There is no gene that moves a person from normal to psychologically disordered. In fact, the words “risk” and “disorder” no longer have the same meanings they once did. “There are no disorders to diagnose and there are no disorders to cure. Polygenic scores will be used to index problems quantitatively rather than deciding whether someone ‘has’ a disorder.”17
Polygenic scores will enable clinical psychology to create more precise treatments. They will be especially useful for choosing the right drugs and dosages based on genetic evidence—and, as importantly, avoiding the expense and side effects of trying wrong drugs and dosages.
Clinical psychology’s focus will shift from treatment toward prevention. Clinical psychologists have no effective broad-based, large-scale prevention strategies. But when we know from polygenic scores that an individual is at risk, we can design, test, and eventually identify effective prevention strategies for individuals.
Polygenic scores will promote “positive genomics.” A normal distribution has two tails, and that is as true of psychological states as of any other normally distributed phenomenon. Clinical psychology focuses on the left-hand, negative tail. Knowing where a person stands on the continuum for certain traits can make it easier to identify ways to focus “on strengths instead of problems, abilities rather than disabilities, and resiliencies instead of vulnerabilities.”18 Polygenic scores will also encourage more attention to the right-hand tail of the distribution, which for many traits can have its own problems: perhaps, for example, the opposite of being at high risk for bipolar disorder is not sunny emotional stability, but instead a flat affect that leaves a person unable to experience the highs and lows of life. What is the sweet spot (the operationalization of Aristotle’s golden mean) for a psychological trait? We’re going to learn far more about such things as polygenic scores become available.
Psychology research will be similarly transformed, Plomin argues, as polygenic scores make it possible for researchers to ask questions about nature and nurture with far greater precision and sophistication than in the past. Furthermore, the number of researchers who can participate in the research will increase manyfold. Until now, only researchers who had access to databases of twins and adoptees could ask questions about the roles of nature and nurture. Now, researchers can use any database that includes genomic information to do such analyses, and the number of such databases is growing rapidly.
The study of “generalist” genes will be opened up. Researchers have already identified what appears to be a general genetic factor of psychopathology, finding polygenic score correlations of +0.50 or more for schizophrenia, major depressive disorder, and bipolar disorder. The general factor of intelligence, g, is being informed by GWA studies. More broadly, researchers will be able to develop polygenic scores that investigate the genetic links among multiple traits, eventually building a picture of their overall genomic architecture.
All the questions about the relative roles of nature and nurture that twin studies have addressed can be revisited with greater precision. “Polygenic scores can be used to nail down genetic influence on the variance of environmental measures and on their covariance with psychological measures. They can also control for genetic influence in order to study purer environmental effects.”19 And that’s just the beginning of the G×E interactions that polygenic scores allow researchers to explore.
Eric Turkheimer’s Phenotypic Null Hypothesis
“Science is about causes, period.”20 That’s the first sentence in an Eric Turkheimer article about Plomin’s work on the shared and nonshared environment.21 It captures the fundamental difference between the approaches of the two men. Plomin focuses on predictive validity while Turkheimer focuses on ultimate causes.