by David Reich
Whatever explains these patterns, it is clear that we have much more to learn. The period before fifty thousand years ago was a busy time in Eurasia, with multiple human populations arriving from Africa beginning at least 1.8 million years ago. These populations split into sister groups, diverged, and mixed again with each other and with new arrivals. Most of those groups have since gone extinct, at least in their “pure” forms. We have known for a while, from skeletons and archaeology, that there was some impressive human diversity prior to the migration of modern humans out of Africa. However, we did not know before ancient DNA was extracted and studied that Eurasia was a locus of human evolution that rivaled Africa. Against this background, the fierce debates about whether modern humans and Neanderthals interbred when they met in western Eurasia—which have been definitively resolved in favor of interbreeding events that made a contribution to billions of people living today—seem merely anticipatory. Europe is a peninsula, a modest-sized tip of Eurasia. Given the wide diversity of Denisovans and Neanderthals—already represented in DNA sequences from at least three populations separated from each other by hundreds of thousands of years, namely Siberian Denisovans, Australo-Denisovans, and Neanderthals—the right way to view these populations is as members of a loosely related family of highly evolved archaic humans who inhabited a vast region of Eurasia.
Ancient DNA has allowed us to peer deep into time, and forced us to question our understanding of the past. If the first Neanderthal genome published in 2010 opened a sluice in the dam of knowledge about the deep past, the Denisova genome and subsequent ancient DNA discoveries opened the floodgates, producing a torrent of findings that have disrupted many of the comfortable understandings we had before. And that was only the beginning.
Part II
How We Got to Where We Are Today
4
Humanity’s Ghosts
The Discovery of the Ancient North Eurasians
When confronted with the diversity of life, evolutionary biologists are drawn to the metaphor of a tree. Charles Darwin, at the inception of the field, wrote: “The affinities of all the beings of the same class have sometimes been represented by a great tree….The green and budding twigs may represent existing species….The limbs divided into great branches, and these into lesser and lesser branches, were themselves once, when the tree was small, budding twigs.”1 Present populations budded from past ones, which branched from a common root in Africa. If the tree metaphor is right, then any population today will have a single ancestral population at each point in the past. The significance of the tree is that once a population separates, it does not remix, as fusions of branches cannot occur.
The avalanche of new data that has become available in the wake of the genome revolution has shown just how wrong the tree metaphor is for summarizing the relationships among modern human populations. My closest collaborator, the applied mathematician Nick Patterson, developed a series of formal tests to evaluate whether a tree model is an accurate summary of real population relationships. Foremost among these was the Four Population Test, which, as described in part I, examines hundreds of thousands of positions on the genome where individuals vary, for example where some people have an adenine (one of the four nucleotide bases, or “letters,” of DNA) and others have a guanine, a difference that reflects a mutation that occurred deep in the past. If a set of four populations is described by a tree, then the frequencies of their mutations are expected to have a simple relationship.2
The most natural way to test the tree model is to measure the frequencies of mutations in the genomes of two populations that we hypothesize have split from the same branch. If a tree model is correct, the frequencies of mutations in the two populations will have changed randomly since their separation from the other two more distantly related populations, and so the frequency differences between these two pairs of populations will be statistically independent. If a tree model is wrong, there will be a correlation between the frequency differences, pointing to the likelihood of mixture between the branches. The Four Population Test was central to our demonstration that Neanderthals are more closely related to non-Africans than to Africans, and thus that there was interbreeding between Neanderthals and non-Africans.3 But findings about interbreeding between archaic and modern humans are only a small part of what has been discovered with Four Population Tests.
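The logic of the test can be made concrete with a small sketch. The formula below, the average over positions of the product of two frequency differences, is the form in which this kind of statistic is usually written in the population-genetics literature; it is an assumption here, since the text describes the test only in words. The simulated data are equally hypothetical: Gaussian noise stands in for the random frequency changes that accumulate after populations separate, and the function names, the drift size, and the 30 percent mixture proportion are all invented for illustration.

```python
import random
from statistics import mean


def f4(p_a, p_b, p_c, p_d):
    """Average over sites of (a - b) * (c - d), given per-site mutation frequencies."""
    if not len(p_a) == len(p_b) == len(p_c) == len(p_d):
        raise ValueError("all populations need the same number of sites")
    return mean((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d))


def simulate(n_sites=20000, drift_sd=0.03, mix=0.3, seed=1):
    """Toy frequencies for the tree ((A, B), (C, D)), plus a mixed version of B.

    Gaussian noise is a stand-in for random drift (an illustrative assumption).
    """
    rng = random.Random(seed)

    def drift(p):
        return p + rng.gauss(0.0, drift_sd)

    a, b, c, d, b_mixed = [], [], [], [], []
    for _ in range(n_sites):
        root = rng.uniform(0.3, 0.7)          # frequency in the common ancestor
        anc_ab, anc_cd = drift(root), drift(root)
        pa, pb = drift(anc_ab), drift(anc_ab)
        pc, pd = drift(anc_cd), drift(anc_cd)
        a.append(pa)
        b.append(pb)
        c.append(pc)
        d.append(pd)
        # B after receiving a fraction `mix` of its ancestry from C
        b_mixed.append((1 - mix) * pb + mix * pc)
    return a, b, c, d, b_mixed


a, b, c, d, b_mixed = simulate()
f4_tree = f4(a, b, c, d)         # differences are independent: near zero
f4_mixed = f4(a, b_mixed, c, d)  # mixture correlates them: clearly nonzero
print(f"tree-like populations: {f4_tree:+.6f}")
print(f"B mixed with C:        {f4_mixed:+.6f}")
```

In the tree-like case the statistic hovers around zero, because the frequency differences within each pair are independent. Once B carries ancestry from C, the two differences become correlated and the statistic moves away from zero. Real analyses also attach a standard error to the statistic, which this sketch omits.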
My laboratory’s first major discovery using the Four Population Test came when we tested the widely held view that Native Americans and East Asians are “sister populations” that descend from a common ancestral branch that separated earlier from the ancestors of Europeans and sub-Saharan Africans. To our surprise, we found that at mutations not shared with sub-Saharan Africans, Europeans are more closely related to Native Americans than they are to East Asians. It would be tempting to argue that this observation has a trivial explanation, such as Native Americans having some ancestry from European migrants over the last five hundred years. But we found the same pattern in every Native American population we studied, including those we could prove had no European admixture. The scenario of Native Americans and Europeans descending from a common population that split earlier from East Asians was also contradicted by the data. Something was deeply wrong with the standard tree model of population relationships.
We wrote a paper describing these results, suggesting that the patterns reflect an episode of mixture deep in the ancestry of Native Americans: a coming together of people related to Europeans and people related to East Asians prior to crossing the Bering land bridge between Asia and the Americas. We submitted this paper, “Ancient Mixture in the Ancestry of Native Americans,” in 2009. It was accepted pending minor revisions, but as it turns out, we never published it.
Even as we were making our final revisions to that paper, Patterson discovered something even stranger, which made us realize we had understood only part of the story.4 To explain his discovery, I need to describe another statistical test we devised, the Three Population Test, which evaluates a “test” population for evidence of mixture. If the test population is a mixture of lineages related to the comparison populations in two different ways—as African Americans are a mixture of Europeans and West Africans—then the frequencies of the test population’s mutations are expected to be intermediate between those of the two comparison populations. In contrast, if mixture did not occur, there is no reason to expect the frequencies of mutations in the population to be intermediate. Thus the scenarios of mixture and no mixture yield two qualitatively very different patterns.
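A toy calculation shows why intermediate frequencies leave a recognizable signature. The sketch below assumes the statistic takes the form commonly used in the literature, the average over positions of (test minus first comparison) times (test minus second comparison); the text does not give a formula, and practical implementations add a correction for finite sample size that is left out here. All frequencies are made up for illustration.

```python
from statistics import mean


def f3(p_test, p_ref1, p_ref2):
    """Average over sites of (t - x) * (t - y).

    When the test frequency lies between the two comparison frequencies,
    the two factors have opposite signs and the product is negative.
    """
    if not len(p_test) == len(p_ref1) == len(p_ref2):
        raise ValueError("all populations need the same number of sites")
    return mean((t - x) * (t - y) for t, x, y in zip(p_test, p_ref1, p_ref2))


# Hypothetical mutation frequencies at three positions.
ref1 = [0.1, 0.8, 0.4]
ref2 = [0.5, 0.2, 0.6]

# A test population formed as an equal mixture of the two comparison populations.
mixed = [0.5 * x + 0.5 * y for x, y in zip(ref1, ref2)]

# A test population whose frequencies are not intermediate.
separate = [0.7, 0.9, 0.2]

f3_mixed = f3(mixed, ref1, ref2)
f3_separate = f3(separate, ref1, ref2)
print(f"mixed test population:    {f3_mixed:+.4f}")
print(f"separate test population: {f3_separate:+.4f}")
```

The mixed population yields a negative value and the separate one a positive value, which is the qualitative contrast between the mixture and no-mixture scenarios.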
When we applied the Three Population Test to diverse human populations, we detected negative statistics (the signature of mutation frequencies intermediate between those of the two comparison populations) when the test population was northern European, proving that population mixture occurred in the ancestors of northern Europeans. We tried all possible pairs of comparison populations from more than fifty worldwide populations and found that the mixture evidence was strongest when one comparison population was southern European, especially Sardinians, and the other was Native Americans. It was clearly Native American populations that produced the most negative values, as we found that the statistic was more negative when we used Native Americans for the second comparison population than when we used East Asians, Siberians, or New Guineans. What we had found was evidence that people in northern Europe, such as the French, are descended from a mixture of populations, one of which shared more ancestry with present-day Native Americans than with any other population living today.
How could we understand the results of both the Three Population Test and the Four Population Test? We proposed that more than fifteen thousand years ago, there was a population living in northern Eurasia that was not the primary ancestral population of the present-day inhabitants of the region. Some people from this population migrated east across Siberia and contributed to the population that crossed the Bering land bridge and gave rise to Native Americans. Others migrated west and contributed to Europeans. This would explain why today, the evidence of mixture in Europeans is strong when using Native Americans as a surrogate for the ancestral population and not as strong in indigenous Siberians, who plausibly descend from more recent, post–ice age migrations into Siberia from more southern parts of East Asia.
We called this proposed new population the “Ancient North Eurasians.” At the time we proposed them, they were a “ghost”—a population that we can infer existed in the past based on statistical reconstruction but that no longer exists in unmixed form. The Ancient North Eurasians would without a doubt have been called a “race” had they lived today, as we could show that they must have been genetically about as differentiated from all other Eurasian populations who lived at the time as today’s “West Eurasians,” “Native Americans,” and “East Asians” are from one another. Although they have not left unmixed descendants, the Ancient North Eurasians have in fact been extraordinarily successful. If we put together all the genetic material that they have contributed to present-day populations, they account for literally hundreds of millions of genomes’ worth of people. All told, more than half the world’s population derives between 5 percent and 40 percent of their genomes from the Ancient North Eurasians.
The case of the Ancient North Eurasians showed that while a tree is a good analogy for the relationships among species—because species rarely interbreed and so like real tree limbs are not expected to grow back together after they branch5—it is a dangerous analogy for human populations. The genome revolution has taught us that great mixtures of highly divergent populations have occurred repeatedly.6 Instead of a tree, a better metaphor may be a trellis, branching and remixing far back into the past.7
The Ghost Is Found
At the end of 2013, Eske Willerslev and his colleagues published genome-wide data from the bones of a boy who had lived at the Mal’ta site in south-central Siberia around twenty-four thousand years ago.8 The Mal’ta genome had its strongest genetic affinity to Europeans and Native Americans, and far less affinity to the Siberians who live in the region today—just as we had predicted for the ghost population of the Ancient North Eurasians. The Mal’ta genome has now become the prototype sample for the Ancient North Eurasians. Paleontologists would call it a “type specimen,” the individual used in the scientific literature to define a newly discovered group.
With the Mal’ta genome in hand, the other pieces of the puzzle snapped into place. It was no longer necessary to reconstruct from present-day populations what had happened long ago. Instead, with a genome sampled directly from the ghost population, it was possible to understand migrations and population admixtures from tens of thousands of years ago as if we were analyzing recent history. What became possible with the Mal’ta genome is the best example I know of the power of ancient DNA to uncover history that until then could only be dimly perceived from present-day data.
The analysis of the Mal’ta genome made it clear that Native Americans derive about a third of their ancestry from the Ancient North Eurasians, and the remainder from East Asians. It is this major mixture that explains why Europeans are genetically closer to Native Americans than they are to East Asians. Our unpublished manuscript claiming that Native Americans descend from a mixture of East Asian and West Eurasian related lineages had been correct, but it was just not the whole story; the paper was overtaken by events in the fast-moving field of ancient DNA. What Willerslev and colleagues found went far beyond what we had been able to do by relying on only modern populations. The Willerslev team not only proved that Native Americans issued from population mixture—which we had not succeeded in doing as we could not rule out an alternative scenario—but they also showed that the mixture was part of a larger story.
The finding that several of the great populations outside of Africa today are profoundly mixed was at odds with what most scientists expected. Prior to the genome revolution, I, like most others, had assumed that the big genetic clusters of populations we see today reflect the deep splits of the past. But in fact the big clusters today are themselves the result of mixtures of very different populations that existed earlier. We have since detected similar patterns in every population we have analyzed: East Asians, South Asians, West Africans, southern Africans. There was never a single trunk population in the human past. It has been mixtures all the way down.
The Ghost of the Near East
Throughout 2013, Iosif Lazaridis in my laboratory was troubled by a result that could not be understood without ancient DNA.
Lazaridis was trying to understand a peculiar Four Population Test result showing that East Asians, present-day Europeans, and pre-farming European hunter-gatherers from around eight thousand years ago are not related to one another according to the tree model. Instead, his analysis showed that East Asians today are genetically more closely related on average to the ancestors of ancient European hunter-gatherers than they are to the ancestors of present Europeans. Ancient DNA studies prior to his work had already shown that present-day Europeans derive some of their ancestry from migrations of farmers from the Near East, who I had assumed were derived from the same ancestral population as European hunter-gatherers. Lazaridis now realized that the ancestry of the first European farmers was distinct from European hunter-gatherers in some way. Something more complicated was going on.
Lazaridis weighed two alternative explanations. One explanation was that there was mixture between the ancestors of ancient European hunter-gatherers and ancient East Asians, bringing these two populations together genetically. There are no insurmountable geographic barriers between Europe and East Asia, so this was a distinct possibility. The alternative explanation was that early European farmers who contributed much of the DNA to present-day Europeans derived some of their ancestry from a population that split early from the main group that peopled Eurasia. This would render East Asians less similar to present-day Europeans than they are to pre-farming European hunter-gatherers.
Once the genome sequence from Mal’ta became available, Lazaridis instantly solved the problem.9 With Mal’ta in hand, he carried out Four Population Tests among various sets of four populations. Mal’ta and the pre-farming European hunter-gatherers appeared to descend from a common ancestral population that arose after the separation from East Asians and sub-Saharan Africans. The data were consistent with a simple tree. But when Lazaridis replaced ancient European hunter-gatherers in this statistic with either present-day Europeans or with early European farmers, the tree metaphor could no longer describe the data. Present-day Europeans and Near Easterners are mixed: they carry within them ancestry from a divergent Eurasian lineage that branched from Mal’ta, European hunter-gatherers, and East Asians before those three lineages separated from one another.
Lazaridis called this lineage “Basal Eurasian” to denote its position as the deepest split in the radiation of lineages contributing to non-Africans. The Basal Eurasians were a new ghost population, one as important as the Ancient North Eurasians, measured by the sheer number of descendant genomes they have left behind. The extent of the deviations of the Four Population Test away from the value of zero that would be expected if the populations were related by a simple tree indicates that this ghost population contributed about a quarter of the ancestry of present-day Europeans and Near Easterners. It also contributed comparable proportions of ancestry to Iranians and Indians.
No one has yet collected ancient DNA from the Basal Eurasians. Finding such a sample is at present one of the holy grails in the field of ancient DNA, just as finding the Ancient North Eurasians had been before the Mal’ta discovery. But we know that Basal Eurasians existed. And even without having their ancient DNA, we know important facts about them based on the genomic fragments they have left behind in samples for which we do have data.
An extraordinary feature of the Basal Eurasians compared to all other lineages that have contributed to present-day people outside of Africa is that they harbored little or no Neanderthal ancestry. In 2016, we analyzed ancient DNA from the Near East to show that people who lived in the region fourteen thousand to ten thousand years ago had approximately 50 percent Basal Eurasian ancestry, about twice the proportion in Europeans today. Plotting the proportion of Basal Eurasian ancestry against the proportion of Neanderthal ancestry, we realized that the less Basal Eurasian ancestry a non-African person has, the more Neanderthal ancestry he or she has. Thus non-Africans who have zero percent Basal Eurasian ancestry have twice as much Neanderthal DNA as ones with 50 percent Basal Eurasian ancestry. By extrapolation, we might expect 100 percent Basal Eurasians to have no Neanderthal ancestry at all.10 So wherever the Neanderthal admixture occurred, it seems to have largely happened after the other branches of the non-African family tree separated from Basal Eurasians.
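The extrapolation is simple arithmetic, sketched here under the assumption that the relationship is a straight line. Neanderthal ancestry is expressed in relative units, with 1.0 standing for a person who has no Basal Eurasian ancestry; the helper name `line_through` is hypothetical, and the two anchor points are just the ones stated in the text (twice as much Neanderthal DNA at zero percent Basal Eurasian ancestry as at 50 percent).

```python
def line_through(point0, point1):
    """Return the straight line passing through two (x, y) points as a function."""
    (x0, y0), (x1, y1) = point0, point1
    slope = (y1 - y0) / (x1 - x0)
    return lambda x: y0 + slope * (x - x0)


# x: fraction of Basal Eurasian ancestry.
# y: Neanderthal ancestry relative to a person with no Basal Eurasian ancestry.
neanderthal_share = line_through((0.0, 1.0), (0.5, 0.5))

# Extrapolate the line to a hypothetical unmixed Basal Eurasian.
at_full_basal = neanderthal_share(1.0)
print(f"relative Neanderthal ancestry at 100% Basal Eurasian: {at_full_basal:.2f}")
```

The line reaches zero at 100 percent Basal Eurasian ancestry, which is the sense in which unmixed Basal Eurasians are expected to have carried no Neanderthal ancestry. A real analysis would fit the line to many individuals with measurement error rather than to two exact points.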