by David Reich
The Genomic Rehabilitation of Joseph Greenberg
The genetic discovery of the spread of the First Americans also helps to resolve a linguistic controversy. The extraordinary diversity of Native American languages had been noted as early as the seventeenth century, with some European missionaries attributing it to the devil’s efforts to resist the conversion of Native populations by making the language that missionaries needed to learn to proselytize to one population useless for proselytizing to the next. Linguists can be divided into “splitters,” who emphasize differences among languages, and “lumpers,” who emphasize their common roots. One of the most extreme splitters was Lyle Campbell, who divided about one thousand Native American languages into about two hundred families (groups of related languages), sometimes even localized to particular river valleys.32 One of the most extreme lumpers was Joseph Greenberg, who argued that he could group all Native American languages into just three families, the deep connections of which he could trace. He argued that these three families reflected three great waves of migration from Asia.
Campbell and Greenberg clashed famously in their interpretation of Native American language relationships, with Campbell finding Greenberg’s tripartite classification so objectionable that he wrote in 1986 that Greenberg’s classification “should be shouted down.”33 In fact, two of the language families are indisputable: Eskimo-Aleut languages spoken by many of the indigenous peoples of Siberia, Alaska, northern Canada, and Greenland, and Na-Dene languages spoken by a subset of the Native American tribes living on the Pacific coast of northern North America, in the interior of northern Canada, and in the southwestern United States.
But it was Greenberg’s third family, “Amerind,” which he claimed includes about 90 percent of the languages of Native Americans, that so many linguists found objectionable. The method that Greenberg used to propose Amerind was to study several hundred words across different Native American languages and to score them according to the extent to which they were shared. By finding high rates of sharing, he claimed evidence for common origin. As he saw it, proto-Amerind was spoken by the first Americans south of the ice sheets. Because he found that every non-Na-Dene and non-Eskimo-Aleut language throughout the Americas could be classified as Amerind using this approach, he concluded that the language data supported a theory of three great waves of Native American dispersal from Asia. If there had been another wave, it would have left another distinct set of languages.
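The kind of scoring described here can be made concrete with a toy calculation. The sketch below is only an illustration under loud assumptions, not Greenberg's actual procedure: it assumes that someone has already judged which word forms resemble each other and recorded that judgment as an arbitrary "resemblance class" number for each meaning. The language names, meanings, and class numbers are all hypothetical placeholders, and the judging step that the code takes for granted is exactly the step critics disputed.

```python
from itertools import combinations

def sharing_rate(forms_a, forms_b):
    """Fraction of meanings attested in both languages whose forms
    were assigned to the same (hypothetical) resemblance class."""
    common = set(forms_a) & set(forms_b)
    if not common:
        return 0.0
    shared = sum(1 for meaning in common if forms_a[meaning] == forms_b[meaning])
    return shared / len(common)

def pairwise_sharing(languages):
    """Sharing rate for every unordered pair of languages."""
    return {
        (x, y): sharing_rate(languages[x], languages[y])
        for x, y in combinations(sorted(languages), 2)
    }

# Hypothetical data: meaning -> resemblance class judged by an analyst.
toy_languages = {
    "lang_a": {"water": 1, "hand": 1, "two": 1, "eye": 2},
    "lang_b": {"water": 1, "hand": 1, "two": 2, "eye": 2},
    "lang_c": {"water": 3, "hand": 2, "two": 3, "eye": 1},
}

rates = pairwise_sharing(toy_languages)
for pair, rate in rates.items():
    print(pair, rate)
```

In this toy example lang_a and lang_b share three of four classes while lang_c shares none with either, so a lumper would group the first two. With only four meanings, the critics' worry about short word lists is easy to see: a single reclassified form changes the rate by a quarter.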
The critique of Greenberg’s ideas that followed was withering. Critics argued that the list of words was too brief to establish commonality. Critics also disputed the claim that these words truly stemmed from common roots. Identification of shared words is thought to become difficult for time depths of more than a few thousand years because languages change so fast, but Greenberg was claiming to detect links at twice this time depth.
But Greenberg got something right. His category of Amerind corresponds almost exactly to the First American category found by genetics. The clusters of populations that he predicted to be most closely related based on language were in fact verified by the genetic patterns in populations for which data are available. And the present-day balkanization of Native American languages also reflects a history in which the great majority of populations descend from a single migratory spread. Anyone looking at a language map of the Americas can see that its appearance is qualitatively different from that of Eurasia or Africa, with dozens of language families restricted to small territories, compared to the vast swaths of territory in Eurasia and Africa inhabited by people who speak closely related tongues in the Indo-European, Austronesian, Sino-Tibetan, and Bantu language families, each of which reflects a history of mass migrations and population replacements. The First American expansion seems to have been so fast that the languages of the continent are related by a rake-like structure with many tines extending in parallel to a common root that dates close to the time of the early settlement of the Americas.34 So both the genetic and linguistic evidence support a scenario in which many of the present-day Native American populations are direct descendants of populations that plausibly lived in the same region shortly after the first peopling of the continent. This suggests that after the initial dispersal, population replacement was less frequent in the Americas than it was in Africa and Eurasia.
Figure 20. This simplified tree relates the three groupings of Native American populations hypothesized by Joseph Greenberg based on linguistic data. The groupings correspond to three distinct entries into the Americas, but Greenberg did not know about the high proportions of First American ancestry in all groups: about 90 percent in Na-Dene speakers and about 60 percent in Eskimo-Aleut speakers.
While the genetic data provided a large measure of confirmation for Greenberg’s broad picture, he missed something important. Although Eskimo-Aleut and Na-Dene speakers are genetically distinguishable from other Native Americans because they carry ancestry from distinct streams of migration from Asia, both have large amounts of First American ancestry: around a 60 percent mixture proportion in the case of the Eskimo-Aleut speakers we studied, and around a 90 percent proportion in the case of some Na-Dene speakers.35 So while Greenberg’s three predicted language groups correlate well with three ancient populations, First Americans have made a dominant demographic contribution to all present-day indigenous peoples in the Americas.
Population Y
The next card dealt from the genetic deck was a complete surprise—at least to us geneticists.
Some physical anthropologists studying the shapes of human skeletons had for years been asserting that there are some American skeletons, dating to before ten thousand years ago, that do not look like what one would expect for the ancestors of today’s Native Americans. The most iconic is Luzia, an approximately 11,500-year-old skeleton whose remains were found in Lapa Vermelha, Brazil, in 1975. Many anthropologists find the shape of her face more similar to those of indigenous peoples from Australia and New Guinea than to those of ancient or modern peoples of East Asia, or Native Americans. This puzzle led to speculation that Luzia came from a group that preceded Native Americans. Anthropologist Walter Neves has identified dozens of Mesoamerican and South American skeletons with what he calls a “Paleoamerican” morphology. Exhibit number one for Neves is a set of fifty-five skulls dating to ten thousand years ago or more from a prehistoric garbage dump at Lagoa Santa in Brazil.36
These claims are controversial. Morphological traits vary depending on diet and environment, and after the arrival of humans in the Americas, natural selection as well as random changes that accumulate in populations over time may have contributed to morphological change. The experience of Kennewick Man, whose skeleton has morphological affinities to those of Pacific Rim populations but genetically is derived entirely from the same ancestral population as other Native Americans, serves as a great warning—an object lesson about the danger of interpreting morphology as strong evidence of population relationships.37 Many have criticized Neves by suggesting that his analyses were statistically flawed, in that he chose which sites to include in his analysis in order to strengthen his Paleoamerican idea and deliberately left out those that did not fit, an approach inconsistent with rigorous science.
Figure 21. Despite extraordinary geographic distance, populations in the Amazon share ancestry with Australians, New Guineans, and Andamanese to a greater extent than with other Eurasians. This may reflect an early movement of humans into the Americas from a source population that is no longer substantially represented in northeast Asia.
Nonetheless, Pontus Skoglund decided to inspect Native American genetic data more closely, looking for traces of ancestry different from the First Americans. His logic went as follows. If there were ancient people on the continent who were displaced by First Americans, they may have mixed with the ancestors of present-day populations, leaving some statistical signal in the genomes of people living today.
Skoglund undertook a Four Population Test to compare all possible pairs of populations from the Americas that we had previously thought were entirely of First American ancestry to all possible pairs of populations outside the Americas, among them indigenous people from Australasia (including Andaman Islanders, New Guineans, and Australians) and other populations hypothesized by some anthropologists to be related to Paleoamericans. He found two Native American populations, both from the Amazon region of Brazil, that are more closely related to Australasians than to other world populations. After joining my laboratory as a postdoctoral scientist, Skoglund found weaker signals of genetic affinity to Australasians, but still probably real, in other Native American populations ringing the Amazon basin. He estimated that the proportion of ancient ancestry in these populations was small (1 to 6 percent), with the rest being consistent with First American ancestry.38
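The logic of a Four Population Test can be illustrated with a toy calculation. The sketch below is a simplification under stated assumptions, not the analysis Skoglund ran. It assumes that allele frequencies at each site are already available for two American populations (A, B) and two outside populations (C, D), and it computes the statistic usually written f4(A, B; C, D): the average over sites of (a - b) × (c - d). If A and B are equally related to C and D, the expected value is zero; a value far from zero relative to its standard error suggests that one American population shares extra ancestry with one of the outside populations. The function names and the unweighted delete-one-block jackknife are illustrative choices.

```python
from math import sqrt

def f4(freq_a, freq_b, freq_c, freq_d):
    """Mean over sites of (a - b) * (c - d) for per-site allele frequencies."""
    products = [(a - b) * (c - d)
                for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)]
    return sum(products) / len(products)

def f4_z_score(freq_a, freq_b, freq_c, freq_d, n_blocks=10):
    """Return (estimate, Z) using a simple delete-one-block jackknife.

    Sites are assumed to be ordered along the genome so that contiguous
    blocks are roughly independent. The unweighted jackknife used here is
    a simplification chosen for readability.
    """
    products = [(a - b) * (c - d)
                for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)]
    n = len(products)
    if n < n_blocks or n_blocks < 2:
        raise ValueError("need at least two blocks and one site per block")
    total = sum(products)
    estimate = total / n
    block_size = n // n_blocks
    leave_one_out = []
    for i in range(n_blocks):
        start = i * block_size
        end = n if i == n_blocks - 1 else start + block_size
        block = products[start:end]
        # Recompute the statistic with this block of sites removed.
        leave_one_out.append((total - sum(block)) / (n - len(block)))
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_loo) ** 2 for x in leave_one_out)
    se = sqrt(variance)
    z = estimate / se if se > 0 else float("nan")
    return estimate, z
```

Repeating such a calculation across every pairing of American and non-American populations, as the text describes, amounts to looping these functions over all combinations and looking for Z-scores that stand out.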
Skoglund and I were initially skeptical about these findings, but the statistical evidence just kept getting stronger. We saw the same patterns in multiple independently collected datasets. We also showed that these patterns could not arise as a result of recent migrations from Asian populations—while Amazonians had their strongest affinity to indigenous people from Australia, New Guinea, and the Andaman Islands (compared to East Asians as a baseline), they were not particularly close to any of them. Also contradicted by the genetic data was a Polynesian migration from the Pacific across to the Americas. While such a migration could have reasonably occurred over the past couple of thousand years as Polynesians mastered the technology of transoceanic travel, the affinities we found had nothing in common with Polynesians. It really looked like evidence of a migration into the Americas of an ancient population more closely related to Australians, New Guineans, and Andamanese than to present-day Siberians. We concluded that we had found evidence of a “ghost” population: a population that no longer exists in unmixed form. We called this “Population Y” after the word ypykuéra, meaning “ancestor” in Tupí, the language family of the populations with the largest proportions of this ancestry.
The Tupí-speaking population in which we found the most Population Y ancestry was the Suruí, the authors of the origin myth that begins this chapter. They now number about fourteen hundred people and live in the Brazilian state of Rondônia.39 They have been relatively isolated, establishing formal relations with the government of Brazil only in the 1960s when road builders came through their territory. Since then, the Suruí have defended their land from deforestation, taken over coffee plantations, and reported illegal loggers and miners. They have sought representation from indigenous rights groups in the United States and claimed carbon credits for the greenhouse gases conserved through the rainforest they have protected.
Another group belonging to the Tupí language family in which we found Population Y ancestry is the Karitiana. The Karitiana are discussed at the beginning of this chapter as one of the first Native American tribes to become active in protesting against genetic research—in their case because of concern that DNA samples had been taken from them in 1996 with a promise of improved access to health care that has never been realized. The Karitiana are around three hundred strong and also come from Rondônia. The samples we analyzed were not part of this tainted 1996 sampling but instead from a 1987 sampling in which informed consent procedures consistent with the ethical standards of the time appear to have been followed. I hope that the Karitiana individuals who encounter our findings will welcome these observations about their distinctive ancestry as a positive discovery that highlights benefits that can come from engaging in scientific studies.40
The third population in which we found substantial Population Y ancestry is the Xavante, who speak a language of the Ge group, which is different from the Tupí language group spoken by the Suruí and Karitiana. They number around eighteen thousand people and are located in Brazil’s Mato Grosso state, on the Brazilian plateau. They have been forcibly relocated, their land today suffers from environmental degradation, and their indigenous way of life is constantly under threat from development.41
We found little or no Population Y ancestry in Mesoamerica or in South Americans to the west of the high Andes. We also did not detect Population Y ancestry in the almost thirteen-thousand-year-old genome of the Clovis culture infant from the northern United States, or in present-day Algonquin speakers from Canada. The Population Y geographic distribution is largely limited to Amazonia, providing yet more evidence for an ancient origin. The fact that Population Y ancestry is restricted to difficult terrain far from the Bering link to Asia is perhaps what one would expect from an original pioneering population that was once more broadly distributed and was then marginalized by the expansion of other groups. This pattern mirrors the distribution of some other language families—for example, the Tuu, Kx’a, and Khoe-Kwadi languages spoken by the Khoe and San in southern Africa—where islands of these speakers in rugged terrain are surrounded by seas of people speaking other languages.
The fact that the strongest statistical evidence of the ancient lineage we detect is in Brazil, the home of “Luzia” and the Lagoa Santa skeletons, is remarkable, but does not prove that the ancient lineage we discovered coincides with the “Paleoamerican” morphology hypothesized by Neves and others. Neves claimed to see the Paleoamerican morphology not only in ancient Brazilians but also in ancient and relatively recent Mexicans, and yet we found no hint of a signal in Mexicans. In addition, Eske Willerslev’s group obtained DNA from two Native American groups that had skeletal morphology typical of Paleoamericans according to Neves: Pericúes in the Baja California peninsula of northwestern Mexico and Fuegans in the southern tip of South America. Neither of these groups carried Population Y ancestry.42
What, then, does the genetic pattern mean? We already know from archaeology that humans probably arrived south of the ice sheets before the opening of the ice-free corridor, leaving remains at archaeological sites including Monte Verde and the Paisley Caves. But the big population explosion, marked by the Clovis people, only occurred once the ice-free corridor had opened. The genetic data could be giving evidence of early peopling of the Americas by a minimum of two very different groups moving in from Asia, perhaps along two different routes and at different times. If Population Y spread through parts of South America before the First Americans, then it seems likely that after this initial peopling, the First Americans advanced into nearly all of the territories the Population Y people had already visited, replacing them either completely or only partially, as in Amazonia. Population Y ancestry may have survived better in Amazonia than it did elsewhere because of the relative impenetrability of the Amazonian environment. This could have slowed down the movement of First Americans into the region enough to allow people living there to mix with the new migrants rather than simply being replaced.
The Australasian-related ancestry in the Suruí today amounts to a small percentage, about the same as the Neanderthal ancestry in all non-Africans, but it would be unwise to dismiss its importance. This is because the impact of Population Y on Amazonians may be much greater than 2 percent. The ancestors of Population Y had to traverse enormous spaces in Siberia and northern North America where the ancestors of First Americans were also living. It is likely that Population Y was already mixed with large amounts of First American–related ancestry when it started expanding into South America. If so, then the ancestry derived from a lineage related to southern Asians is only a kind of “tracer dye” for Population Y ancestry, like the heavy metals injected into patients’ veins in hospitals to track the paths of their blood vessels in a CT scan. Our estimate of around 2 percent Population Y ancestry in the Suruí is based on the assumption that Population Y traversed the entirety of Northeast Asia and America without mixing with other people it encountered. If we allow for the likelihood that there was mixture with populations related to First Americans on the way, the proportion of Population Y in the Suruí could be as high as 85 percent and still produce the observed statistical evidence of relatedness to Australasians. If the true proportion is even a fraction of this, then the story of First Americans expanding into virgin territory is profoundly misleading. Instead, we need to think in terms of an expansion of a highly substructured founding population of the Americas. The history and timing of the arrival of Population Y in the Americas are likely to be resolved only with recovery of ancient DNA from skeletons with Population Y ancestry.
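The “tracer dye” arithmetic can be sketched in a few lines. This is an illustrative simplification, not the model behind the published estimates: it assumes a simple linear dilution in which the Australasian-related signal seen in a present-day group equals the share of Population Y in that group multiplied by the share of Australasian-related lineage that Population Y itself carried. The function and parameter names are hypothetical.

```python
def population_y_proportion(tracer_fraction, tracer_share_in_y):
    """Implied Population Y share in a present-day group.

    tracer_fraction: Australasian-related ancestry observed today (e.g. 0.02).
    tracer_share_in_y: assumed fraction of Population Y's own ancestry that
        was Australasian-related when it contributed (1.0 means unmixed).
    Assumes simple linear dilution: observed = share_of_Y * tracer_share_in_y.
    """
    if not 0 < tracer_share_in_y <= 1:
        raise ValueError("tracer_share_in_y must be in (0, 1]")
    proportion = tracer_fraction / tracer_share_in_y
    if proportion > 1:
        raise ValueError("implied Population Y share exceeds 100 percent")
    return proportion

# Unmixed Population Y: the tracer is the whole story, about 2 percent.
print(population_y_proportion(0.02, 1.0))

# If the tracer made up only about 2.35 percent of Population Y itself,
# the same observed signal implies roughly 85 percent Population Y.
print(population_y_proportion(0.02, 0.02 / 0.85))
```

Under this toy model the observed 2 percent is compatible with anything from a 2 percent contribution by an unmixed Population Y to a very large contribution by a heavily mixed one, which is why the observed signal alone cannot settle how much Population Y mattered.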
After the First Americans
The great promise of genetic data lies not only in what they can tell us about the deepest origins of Native Americans but also in what they have to say about more recent times and how populations got to be the way they are today.
A prime example is insight into the origin of speakers of Na-Dene languages, who live along the Pacific coast of North America, in parts of northern Canada, and as far south as Arizona in the United States. The overwhelming consensus among linguists is that these languages stem from an ancestral language no more than a few thousand years old, and that their dispersal over this vast range in northwestern America must have been driven at least in part by migrations. In an astonishing development in 2008, the American linguist Edward Vajda documented a deeper connection between Na-Dene languages and a language family of central Siberia called Yeniseian, once spoken by many populations, though today only the Ket language of the Yeniseian family is still used on a day-to-day basis.43 These results suggest that despite the enormous distance, a relatively recent migration from Asia gave rise to Na-Dene speakers in the Americas.