by David Reich
The genetic data told a clear story. Around a third of Indian groups experienced population bottlenecks as strong or stronger than the ones that occurred among Finns or Ashkenazi Jews. We later confirmed this finding in an even larger dataset that we collected working with Thangaraj: genetic data from more than 250 jati groups spread throughout India.38
Many of the population bottlenecks in India were also exceedingly old. One of the most striking we discovered was in the Vysya of the southern Indian state of Andhra Pradesh, a middle caste group of approximately five million people whose population bottleneck we could date (from the size of segments shared between individuals of the same population) to between three thousand and two thousand years ago.
The observation of such a strong population bottleneck among the ancestors of the Vysya was shocking. It meant that after the population bottleneck, the ancestors of the Vysya had maintained strict endogamy, allowing essentially no genetic mixing into their group for thousands of years. Even an average rate of influx into the Vysya of as little as 1 percent per generation would have erased the genetic signal of a population bottleneck. The ancestors of the Vysya did not live in geographic isolation. Instead, they lived cheek by jowl with other groups in a densely populated part of India. Despite proximity to other groups, the endogamy rules and group identity in the Vysya have been so strong that they maintained strict social isolation from their neighbors, and transmitted that culture of social isolation to each and every subsequent generation.
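The claim that even 1 percent influx per generation would erase the signal can be illustrated with a back-of-the-envelope calculation. The sketch below is a simple dilution model, not the segment-sharing statistic used in the study, and the 28-year generation time and the function name are assumptions introduced only for illustration.

```python
# Illustrative sketch only: a simple dilution model, not the method used in the study.
# Assumption: each generation, a constant fraction of the gene pool comes from outside.

def founder_fraction_remaining(influx_per_generation: float, generations: int) -> float:
    """Fraction of the gene pool that still traces back to the original founders."""
    if not 0.0 <= influx_per_generation <= 1.0:
        raise ValueError("influx_per_generation must be between 0 and 1")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return (1.0 - influx_per_generation) ** generations


YEARS_PER_GENERATION = 28  # assumed average generation time; not stated in the text

if __name__ == "__main__":
    for years_ago in (2000, 3000):
        generations = round(years_ago / YEARS_PER_GENERATION)
        remaining = founder_fraction_remaining(0.01, generations)
        print(f"{years_ago} years (~{generations} generations): "
              f"{remaining:.0%} of ancestry still from the founders")
```

Under these assumptions, roughly half to two-thirds of the group's ancestry would come from outsiders after two to three thousand years, which gives a sense of why a sharp bottleneck signal could not survive such mixing.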
And the Vysya were not unique. A third of the groups we analyzed gave similar signals, implying thousands of groups in India like this. Indeed, it is even possible that we were underestimating the fraction of groups in India affected by strong long-term endogamy. To show a signal, a group needed to have gone through a population bottleneck. Groups that descended from a larger number of founders but nevertheless maintained strict endogamy ever since would go undetected by our statistics. Rather than an invention of colonialism as Dirks suggested, long-term endogamy as embodied in India today in the institution of caste has been overwhelmingly important for millennia.
Learning this feature of Indian history had a strong resonance for me. When I started my work on Indian groups, I came to it as an Ashkenazi Jew, a member of an ancient caste of West Eurasia. I was uncomfortable with my affiliation but did not have a clear sense of what I was uncomfortable about. My work on India crystallized my discomfort. There is no escaping my background as a Jew. I was raised by parents whose highest priority was being open to the secular world, but they themselves had been raised in a deeply religious community and were children of refugees from persecution in Europe that left them with a strong sense of ethnic distinctiveness. When I was growing up, we followed Jewish dietary rules at home—I believe my parents did so in part in the hope that their own families would feel comfortable eating at our house—and I went for nine years to a Jewish school and spent many summers in Jerusalem. From my parents as well as from my grandparents and cousins I imbibed a strong sense of difference—a feeling that our group was special—and a knowledge that I would cause disappointment and embarrassment if I married someone non-Jewish (a conviction that I know also had a powerful effect on my siblings). Of course, my concern about disappointing my family is nothing compared to the shame, isolation, and violence that many expect in India for taking a partner outside their group. And yet my perspective as a Jew made me empathize strongly with all the likely Romeos and Juliets over thousands of years of Indian history whose loves across ethnic lines have been quashed by caste. My Jewish identity also helped me to understand on a visceral level how this institution had successfully perpetuated itself for so long.
What the data were showing us was that the genetic distinctions among jati groups within India were in many cases real, thanks to the long-standing history of endogamy in the subcontinent. People tend to think of India, with its more than 1.3 billion people, as having a tremendously large population, and indeed many Indians as well as foreigners see it this way. But genetically, this is an incorrect way to view the situation. The Han Chinese are truly a large population. They have been mixing freely for thousands of years. In contrast, there are few if any Indian groups that are demographically very large, and the degree of genetic differentiation among Indian jati groups living side by side in the same village is typically two to three times higher than the genetic differentiation between northern and southern Europeans.39 The truth is that India is composed of a large number of small populations.
Indian Genetics, History, and Health
The groups of European ancestry that have experienced strong population bottlenecks—Ashkenazi Jews, Finns, Hutterites, Amish, French Canadians of the Saguenay–Lac-St.-Jean region, and others—have been the subject of endless and productive study by medical researchers. Because of their population bottlenecks, rare disease-causing mutations that happened to have been carried in the founder individuals have dramatically increased in frequency. Rare mutations that are innocuous when a person inherits a copy from only one of their parents—they act recessively, which means that two copies are required to cause disease—can be lethal when a person inherits copies from both parents. However, once these mutations increase in frequency due to a population bottleneck, there is an appreciable chance that individuals in the population will inherit the same mutation from both of their parents. For example, in Ashkenazi Jews there is a high incidence of the devastating disease of Tay-Sachs, which causes brain degeneration and death within the first few years of life. One of my first cousins died within months of birth due to an Ashkenazi founder disease called Zellweger syndrome, and one of my mother’s first cousins died young of Riley-Day syndrome, or familial dysautonomia, another Ashkenazi founder disease. Hundreds of such diseases have been identified, and the responsible genes have been identified in European founder populations, including Ashkenazi Jews. These findings have led to important biological insights and in a few cases to the development of drugs that counteract the effect of the damaged genes.
India, of course, has far more people who belong to groups that experienced strong bottlenecks, as the country’s population is huge, and as around one-third of Indian jati groups descend from bottlenecks as strong or stronger than those that occurred in Ashkenazi Jews or Finns. Searches for the genes responsible for disorders in these Indian groups therefore have the potential to identify risk factors for thousands of diseases. Despite the fact that no one has systematically looked, a few such cases are already known. For example, the Vysya are known to have a high rate of prolonged muscle paralysis in response to muscle relaxants given prior to surgery. As a result, clinicians in India know not to give these drugs to people of Vysya ancestry. The condition is due to low levels of the protein butyrylcholinesterase in some Vysya. Genetic work has shown that this condition is due to a recessively acting mutation that occurs at about 20 percent frequency in the Vysya, a far higher rate than in other Indian groups, presumably because the mutation was carried in one of the Vysya’s founders.40 This frequency is sufficiently high that the mutation occurs in two copies in about 4 percent of the Vysya, causing disastrous reactions for people who carry the mutation and go under anesthesia.
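The step from a 20 percent mutation frequency to about 4 percent of people carrying two copies follows from squaring the frequency. The sketch below works through that arithmetic under the standard Hardy-Weinberg assumption of random mating within the group, which is an assumption added here; the function name is hypothetical.

```python
# Illustrative sketch: genotype proportions under Hardy-Weinberg expectations.
# Assumption: mates are chosen at random with respect to this mutation within the group.

def genotype_frequencies(mutation_frequency: float) -> dict[str, float]:
    """Expected share of people carrying zero, one, or two copies of a mutation."""
    if not 0.0 <= mutation_frequency <= 1.0:
        raise ValueError("mutation_frequency must be between 0 and 1")
    q = mutation_frequency
    p = 1.0 - q
    return {
        "no_copies": p * p,       # unaffected, not carriers
        "one_copy": 2 * p * q,    # carriers, unaffected for a recessive condition
        "two_copies": q * q,      # affected by a recessively acting mutation
    }


if __name__ == "__main__":
    for genotype, share in genotype_frequencies(0.20).items():
        print(f"{genotype}: {share:.0%}")
```

With a frequency of 0.20, the two-copy share is 0.20 × 0.20 = 0.04, matching the roughly 4 percent cited, while about a third of the group would be unaffected carriers.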
As the Vysya example demonstrates, the history of India presents an important opportunity for biological discovery, as finding genes for rare recessive diseases is cheap with modern genetic technology. All it takes is access to a small number of people in a jati group with the disease, whose genomes can then be sequenced. Genetic methods can identify which of the thousands of groups in India have experienced strong population bottlenecks. Local doctors and midwives can identify syndromes that occur at high rates in specific groups. It is surely the case that local doctors, having delivered thousands of babies, will know that certain diseases and malformations occur more frequently in some groups than in others. This is all the information one needs to collect a handful of blood samples for genetic analysis. Once these samples are in hand, the genetic work to find the responsible genes is straightforward.
The opportunities for making a medical difference in India through surveys of rare recessive disease are particularly great because arranged marriage is very common. Much as I find restrictions on marriage discomfiting, arranged marriages are a fact in numerous communities in India—as they are in the ultra-Orthodox Jewish community. A number of my own first cousins in the Ashkenazi Jewish Orthodox community have found their spouses that way. In this religious community, a genetic testing organization founded by Rabbi Josef Ekstein in 1983, after he lost four of his own children to Tay-Sachs, has driven many recessive diseases almost to extinction.41 In many Orthodox religious high schools in the United States and Israel, nearly all teenagers are tested for whether they are carriers of the handful of rare recessive disease-causing mutations that are common in the Ashkenazi Jewish community. If they are carriers, they are never introduced by matchmakers to other teenagers carrying the same mutation. There is every opportunity to do the same in India, but instead of affecting a few hundred thousand people, in India the approach could have an impact on hundreds of millions.
A Tale of Two Subcontinents: The Parallel History of India and Europe
Up until 2016, the genetic studies of Indian groups focused on the ANI and the ASI: the two populations that mixed in different proportions to produce the great diversity of endogamous groups still living in India today.
But this changed in 2016, when several laboratories, including mine, published the first genome-wide ancient DNA from some of the world’s earliest farmers, people who lived between eleven thousand and eight thousand years ago in present-day Israel, Jordan, Anatolia, and Iran.42 When we studied how these early farmers of the Near East were related to people living today, we found that present-day Europeans have strong genetic affinity to early farmers from Anatolia, consistent with a migration of Anatolian farmers into Europe after nine thousand years ago. Present-day people from India have a strong affinity to ancient Iranian farmers, suggesting that the expansion of Near Eastern farming eastward to the Indus Valley after nine thousand years ago had as important an impact on the population of India.43 But our studies also revealed that present-day people in India have strong genetic affinities to ancient steppe pastoralists. How could the genetic evidence of an impact of an Iranian farming expansion on the population of India be reconciled with the evidence of steppe expansions? The situation was reminiscent of what we had found a couple of years before in Europe, where today’s populations are a mixture not just of indigenous hunter-gatherers and migrant farmers, but also of a third major group with an origin in the steppe.
To gain some insight, Iosif Lazaridis in my laboratory wrote down mathematical models for present-day Indian groups as mixtures of populations related to Little Andaman Islanders, ancient Iranian farmers, and ancient steppe peoples. What he found is that almost every group in India has ancestry from all three populations.44 Nick Patterson then combined the data from almost 150 present-day Indian groups to come up with a unified model that allowed him to obtain precise estimates of the contribution of these three ancestral populations to present-day Indians.
When Patterson inferred what would have been expected for a population of entirely ANI ancestry—one with no Andamanese-related ancestry—he determined that they would be a mixed population of Iranian farmer–related ancestry and steppe pastoralist–related ancestry. But when he inferred what would have been expected for a population of entirely ASI ancestry—one with no Yamnaya-related ancestry—he found that they too must have had substantial Iranian farmer–related ancestry (the rest being Little Andamanese–related).
This was a great surprise. Our finding that both the ANI and ASI had large amounts of Iranian-related ancestry meant that we had been wrong in our original presumption that one of the two major ancestral populations of the Indian Cline had no West Eurasian ancestry. Instead, people descended from Iranian farmers made a major impact on India twice, admixing both into the ANI and the ASI.
Patterson proposed a major revision to our working model for deep Indian history.45 The ANI were a mixture of about 50 percent steppe ancestry related distantly to the Yamnaya, and 50 percent Iranian farmer–related ancestry from the groups the steppe people encountered as they expanded south. The ASI were also mixed, a fusion of a population descended from earlier farmers expanding out of Iran (around 25 percent of their ancestry), and previously established local hunter-gatherers of South Asia (around 75 percent of their ancestry). So the ASI were not likely to have been the previously established hunter-gatherer population of India, and instead may have been the people responsible for spreading Near Eastern agriculture across South Asia. Based on the high correlation of ASI ancestry to Dravidian languages, it seems likely that the formation of the ASI was the process that spread Dravidian languages as well.
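The revised model implies that the ancestry of any group on the Indian Cline can be worked out from a single number, its share of ANI ancestry. The sketch below applies the approximate proportions quoted above; the function name and the 50 percent ANI example are hypothetical, and the real estimates come from statistical fitting rather than this arithmetic.

```python
# Illustrative sketch: expected ancestry of a group that is a simple ANI/ASI mixture,
# using the approximate proportions quoted in the text.

ANI = {"steppe": 0.50, "iranian_farmer": 0.50, "south_asian_hunter_gatherer": 0.00}
ASI = {"steppe": 0.00, "iranian_farmer": 0.25, "south_asian_hunter_gatherer": 0.75}


def cline_ancestry(ani_fraction: float) -> dict[str, float]:
    """Three-way ancestry of a group with the given share of ANI ancestry."""
    if not 0.0 <= ani_fraction <= 1.0:
        raise ValueError("ani_fraction must be between 0 and 1")
    asi_fraction = 1.0 - ani_fraction
    return {
        source: ani_fraction * ANI[source] + asi_fraction * ASI[source]
        for source in ANI
    }


if __name__ == "__main__":
    # Hypothetical group with half ANI and half ASI ancestry.
    for source, share in cline_ancestry(0.5).items():
        print(f"{source}: {share:.1%}")
```

Because both ancestral populations carry Iranian farmer-related ancestry, that component never drops below about a quarter anywhere on the cline in this simple model, whereas the steppe share is tied directly to the ANI share.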
Figure 18. Both South Asia and Europe were affected by two successive migrations. The first migration was from the Near East after around nine thousand years ago (1), which brought farmers who mixed with local hunter-gatherers. The second migration was from the steppe after around five thousand years ago (2), which brought pastoralists who probably spoke Indo-European languages, who then mixed with the local farmers they encountered along the way. Mixtures of these mixed groups then formed two gradients of ancestry: one in Europe, and one in India.
These results reveal a remarkably parallel tale of the prehistories of two similarly sized subcontinents of Eurasia—Europe and India. In both regions, farmers migrating from the core region of the Near East after nine thousand years ago—in Europe from Anatolia, and in India from Iran—brought a transformative new technology, and interbred with the previously established hunter-gatherer populations to form new mixed groups between nine thousand and four thousand years ago. Both subcontinents were then also affected by a second later major migration with an origin in the steppe, in which Yamnaya pastoralists speaking an Indo-European language mixed with the previously established farming population they encountered along the way, in Europe forming the peoples associated with the Corded Ware culture, and in India eventually forming the ANI. These populations of mixed steppe and farmer ancestry then mixed with the previously established farmers of their respective regions, forming the gradients of mixture we see in both subcontinents today.
The Yamnaya—who the genetic data show were closely related to the source of the steppe ancestry in both India and Europe—are obvious candidates for spreading Indo-European languages to both these subcontinents of Eurasia. Remarkably, Patterson’s analysis of population history in India provided an additional line of evidence for this. His model of the Indian Cline was based on the idea of a simple mixture of two ancestral populations, the ANI and ASI. But when he looked harder and tested each of the Indian Cline groups in turn for whether it fit this model, he found that there were six groups that did not fit in the sense of having a higher ratio of steppe-related to Iranian farmer–related ancestry than was expected from this model. All six of these groups are in the Brahmin varna—with a traditional role in society as priests and custodians of the ancient texts written in the Indo-European Sanskrit language—despite the fact that Brahmins made up only about 10 percent of the groups Patterson tested. A natural explanation for this was that the ANI were not a homogeneous population when they mixed with the ASI, but instead contained socially distinct subgroups with characteristic ratios of steppe to Iranian-related ancestry. The people who were custodians of Indo-European language and culture were the ones with relatively more steppe ancestry, and because of the extraordinary strength of the caste system in preserving ancestry and social roles over generations, the ancient substructure in the ANI is evident in some of today’s Brahmins even after thousands of years. This finding provides yet another line of evidence for the steppe hypothesis, showing that not just Indo-European languages, but also Indo-European culture as reflected in the religion preserved over thousands of years by Brahmin priests, was likely spread by peoples whose ancestors originated in the steppe.
The picture of population movements in India is still far less crisp than our picture of Europe because of the lack of ancient DNA from South Asia. An outstanding mystery is the ancestry of the peoples of the Indus Valley Civilization, who were spread across the Indus Valley and parts of northern India between forty-five hundred and thirty-eight hundred years ago, and were at the crossroads of all these great ancient movements of people. We have yet to obtain ancient DNA from the people of the Indus Valley Civilization, but multiple research groups, including mine, are pursuing this as a goal. At a lab meeting in 2015, the analysts in our group went around the table placing bets on the likely genetic ancestry of the Indus Valley Civilization people, and the bets were wildly different. At the moment, three very different possibilities are still on the table. One is that Indus Valley Civilization people were largely unmixed descendants of the first Iranian-related farmers of the region, and spoke an early Dravidian language. A second possibility is that they were the ASI—already a mix of people related to Iranian farmers and South Asian hunter-gatherers—and if so they would also probably have spoken a Dravidian language. A third possibility is that they were the ANI, already mixed between steppe and Iranian farmer–related ancestry, and thus would instead likely have spoken an Indo-European language. These scenarios have very different implications, but with ancient DNA, this and other great mysteries of the Indian past will soon be resolved.
7
In Search of Native American Ancestors