Who We Are and How We Got Here
Copyright © 2018 by David Reich and Eugenie Reich
All rights reserved. Published in the United States by Pantheon Books, a division of Penguin Random House LLC, New York, and distributed in Canada by Random House of Canada, a division of Penguin Random House Canada Limited, Toronto.
Pantheon Books and colophon are registered trademarks of Penguin Random House LLC.
Library of Congress Cataloging-in-Publication Data
Name: Reich, David [date], author.
Title: Who we are and how we got here : ancient DNA and the new science of the human past / David Reich.
Description: First edition. New York : Pantheon Books, [2018]. Includes bibliographical references and index.
Identifiers: LCCN 2017038165. ISBN 9781101870327 (hardcover). ISBN 9781101870334 (ebook).
Subjects: LCSH: Human genetics—Popular works. Genomics—Popular works. DNA—Analysis. Prehistoric peoples. Human population genetics. BISAC: SCIENCE/Life Sciences/Genetics & Genomics. SCIENCE/Life Sciences/Evolution. SOCIAL SCIENCE/Anthropology/General.
Classification: LCC QH431 .R37 2018. DDC 572.8/6—dc23.
LC record available at lccn.loc.gov/2017038165.
Ebook ISBN 9781101870334
www.pantheonbooks.com
Cover design by Oliver Uberti
Illustrations and map by Oliver Uberti
For Seth and Leah
Contents
Cover
Title Page
Copyright
Dedication
Map
Acknowledgments
Introduction
Part I The Deep History of Our Species
1 How the Genome Explains Who We Are
2 Encounters with Neanderthals
3 Ancient DNA Opens the Floodgates
Part II How We Got to Where We Are Today
4 Humanity’s Ghosts
5 The Making of Modern Europe
6 The Collision That Formed India
7 In Search of Native American Ancestors
8 The Genomic Origins of East Asians
9 Rejoining Africa to the Human Story
Part III The Disruptive Genome
10 The Genomics of Inequality
11 The Genomics of Race and Identity
12 The Future of Ancient DNA
Notes on the Illustrations
Notes
About the Author
30 Population Mixtures
The mixture of highly differentiated populations is a recurrent process in our history. This map provides a key to thirty great mixture events discussed in this book. (Locations are not meant to be precise.)
CHAPTER 2
2a 54,000–49,000 years ago
All non-Africans
Neanderthals + modern humans
CHAPTER 3
3a >70,000 ya
Siberian Denisovans
Superarchaic lineage + Neanderthal-related lineage
3b 49,000–44,000 ya
Papuans and Australians
Denisovans + modern humans
CHAPTER 4
4a 19,000–14,000 ya
Magdalenian expansion
Aurignacian + Gravettian lineages
4b >14,000 ya
Late Near Eastern hunter-gatherers
Basal Eurasians + early Near Eastern hunter-gatherers
4c ~14,000 ya
Bølling-Allerød expansion
Southwest + Southeast European hunter-gatherers
4d 8,000–3,000 ya
Copper and Bronze Age Near East
Iranian + Levantine + Anatolian farmers
CHAPTER 5
5a 9,000–5,000 ya
First European farmers
Local hunter-gatherers + Anatolian farmers
5b 9,000–5,000 ya
Steppe pastoralists
Iranian farmers + local hunter-gatherers
5c 5,000–4,000 ya
Northern European Bronze Age
Eastern European farmers + steppe pastoralists
5d >3,500 ya
Aegean Bronze Age
Iranian farmers + European farmers
5e 3,500 ya – present
Present-day Europeans
Northern + Southern European Bronze Age populations
CHAPTER 6
6a >4,000 ya
Ancestral South Indians
Iranian farmers + indigenous Indian hunter-gatherers
6b 4,000–3,000 ya
Ancestral North Indians
Steppe pastoralists + Iranian farmers
6c 4,000–2,000 ya
Present-day Indians
Ancestral South Indians + Ancestral North Indians
CHAPTER 7
7a >15,000 ya
First Americans
Ancient North Eurasians + East Asians
7b 5,000–4,000 ya
Paleo-Eskimos
Far Eastern Siberians + First Americans
7c >4,000 ya
Amazonians
Population Y + First Americans
7d 2,000–1,000 ya
Na-Dene speakers
Paleo-Eskimos + First Americans
7e 2,000–1,000 ya
Neo-Eskimos
Far Eastern Siberians + First Americans
CHAPTER 8
8a 5,000–4,000 ya
Austroasiatic speakers
Yangtze River Ghost Population + indigenous Southeast Asian hunter-gatherers
8b 5,000–3,000 ya
Tibetans
Yellow River Ghost Population + Tibetan hunter-gatherers
8c 5,000–1,000 ya
Present-day Han Chinese
Yellow + Yangtze River Ghost Populations
8d 4,000–1,000 ya
Southwest Pacific islanders
Papuans + East Asians
8e 3,000–2,000 ya
Present-day Japanese
Mainland farmers + local hunter-gatherers
CHAPTER 9
9a >8,000 ya
Malawi hunter-gatherers
East + South African foragers
9b 4,000–1,000 ya
Bantu expansion
Cameroon source population + local groups throughout eastern and southern Africa
9c >3,000 ya
East African pastoralists
Levantine farmers + East African foragers
9d >2,000 ya
Present-day West Africans
At least two ancient African lineages
9e 2,000–1,000 ya
Present-day Khoe-Kwadi herders
East African pastoralists + indigenous San
Acknowledgments
First thing first. This book emerged out of a year of intense collaboration with my wife, Eugenie Reich. We researched the book together, prepared the first drafts of the chapters together, and talked about the book incessantly as it matured. This book would not have come into being without her.
I am grateful to Bridget Alex, Peter Bellwood, Samuel Fenton-Whittet, Henry Louis Gates Jr., Yonatan Grad, Iosif Lazaridis, Daniel Lieberman, Shop Mallick, Erroll McDonald, Latha Menon, Nick Patterson, Molly Przeworski, Juliet Samuel, Clifford Tabin, Daniel Reich, Tova Reich, Walter Reich, Robert Weinberg, and Matthew Spriggs for close critical readings of the entire book.
I thank David Anthony, Ofer Bar-Yosef, Caroline Bearsted, Deborah Bolnick, Dorcas Brown, Katherine Brunson, Qiaomei Fu, David Goldstein, Alexander Kim, Carles Lalueza-Fox, Iain Mathieson, Eric Lander, Mark Lipson, Scott MacEachern, Richard Meadow, David Meltzer, Priya Moorjani, John Novembre, Svante Pääbo, Pier Palamara, Eleftheria Palkopoulou, Mary Prendergast, Rebecca Reich, Colin Renfrew, Nadin Rohland, Daniel Rozas, Pontus Skoglund, Chuanchao Wang, and Michael Witzel for critiques of individual chapters. I also thank Stanley Ambrose, Graham Coop, Dorian Fuller, Éadaion Harney, Linda Heywood, Yousuke Kaifu, Kristian Kristiansen, Michelle Lee, Daniel Lieberman, Michael McCormick, Michael Petraglia, Joseph Pickrell, Stephen Schiffels, Beth Shapiro, and Bence Viola for reviewing sections of the book for accuracy.
I am grateful to Harvard Medical School, the Howard Hughes Medical Institute, and the National Science Foundation, all of which generously supported my science while I was working on this project, and viewed it as complementary to my primary research.
I finally thank several people who repeatedly encouraged me to write this book. I resisted the idea for years because I did not want to distract myself from my science, and because for geneticists papers are the currency, not books. But my mind changed as my colleagues grew to include archaeologists, anthropologists, historians, linguists, and others eager to come to grips with the ancient DNA revolution. There are many papers I did not write, and many analyses I did not complete, because of the time I needed to write this book. I hope that those who read the book will emerge with a new perspective on who we are.
Introduction
This book is inspired by a visionary, Luca Cavalli-Sforza, the founder of genetic studies of our past. I was trained by one of his students, and so it is that I am part of his school, inspired by his vision of the genome as a prism for understanding the history of our species.
The high-water mark of Cavalli-Sforza’s career came in 1994 when he published The History and Geography of Human Genes, which synthesized what was then known from archaeology, linguistics, history, and genetics to tell a grand story about how the world’s peoples got to be the way they are today.1 The book offered an overview of the deep past. But it was based on what was known at the time and was therefore handicapped by the paucity of genetic data then available, which were so limited as to be nearly useless compared to the far more extensive information from archaeology and linguistics. The genetic data of the time could sometimes reveal patterns consistent with what was already known, but the information they provided was not rich enough to demonstrate anything truly new. In fact, the few major new claims that Cavalli-Sforza did make have essentially all been proven wrong. Two decades ago, everyone, from Cavalli-Sforza to beginning graduate students such as myself, was working in the dark ages of DNA.
Cavalli-Sforza made a grand bet in 1960 that would drive his entire career. He bet that it would be possible to reconstruct the great migrations of the past based entirely on the genetic differences among present-day peoples.2
Through study after study over the subsequent five decades, Cavalli-Sforza seemed to be well on the path to making good on his bet. When he started his work, the technology for studying human variation was so poor that the only possibility was to measure proteins in the blood, using variations like the A, B, and O blood types that are tested by physicians to match blood donors to recipients. By the 1990s, he and his colleagues had assembled data from more than one hundred such variations in diverse populations. Using these data they were able to reliably cluster individuals by continent based on how often they matched each other at these variations: for example, Europeans have a high rate of matching to other Europeans, East Asians to East Asians, and Africans to Africans. In the 1990s and 2000s, they brought their work to a new level by moving beyond protein variation and directly examining DNA, our genetic code. They analyzed a total of about one thousand individuals from around fifty populations spread across the planet, examining variation at more than three hundred positions in the genome.3 When they told their computer—which had no knowledge of the population labels—to cluster the individuals into five groups, the results corresponded uncannily well to commonly held intuitions about the deep ancestral divisions among humans (West Eurasians, East Asians, Native Americans, New Guineans, and Africans).
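The matching idea in the paragraph above can be tried on toy data. What follows is a minimal sketch, not the analysis Cavalli-Sforza and his colleagues ran: it assumes a simplified model in which each individual carries a single 0/1 variant at each position, and the allele frequencies, population sizes, and function names are all hypothetical. It simulates two populations and compares how often pairs of individuals match within a population versus between populations.

```python
import random
from itertools import combinations


def simulate_population(rng, allele_freqs, n_individuals):
    """Draw one 0/1 allele per site for each individual (simplified toy model)."""
    return [[1 if rng.random() < p else 0 for p in allele_freqs]
            for _ in range(n_individuals)]


def match_rate(a, b):
    """Fraction of sites at which two individuals carry the same allele."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


rng = random.Random(0)  # fixed seed so the toy data are reproducible
n_sites = 200

# Hypothetical frequencies: each population favors allele 1
# at a different half of the sites.
freqs_a = [0.8] * (n_sites // 2) + [0.2] * (n_sites // 2)
freqs_b = [0.2] * (n_sites // 2) + [0.8] * (n_sites // 2)

pop_a = simulate_population(rng, freqs_a, 6)
pop_b = simulate_population(rng, freqs_b, 6)

# Average matching over all pairs drawn from the same population...
within = mean(match_rate(x, y)
              for pop in (pop_a, pop_b)
              for x, y in combinations(pop, 2))
# ...and over all pairs that span the two populations.
between = mean(match_rate(x, y) for x in pop_a for y in pop_b)

print(f"mean match rate within populations:  {within:.2f}")
print(f"mean match rate between populations: {between:.2f}")
```

With frequencies this far apart, pairs from the same population match noticeably more often than pairs from different populations, which is the signal that makes clustering possible. Real human populations differ far less at most positions, so many more positions are needed to see the same effect.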
Cavalli-Sforza was especially interested in interpreting the genetic clusters among present-day people in terms of population history. He and his colleagues analyzed their blood group data by using a technique that identifies combinations of biological variations that are most efficient at summarizing differences across individuals. Plotting these combinations of blood group types onto a map of West Eurasia, they found that the one summarizing the most variation across individuals reached its extreme value in the Near East, and declined along a southeast-to-northwest gradient into Europe.4 They interpreted this as a genetic footprint of the migration of farmers into Europe from the Near East, known from archaeology to have occurred after nine thousand years ago. The declining intensity suggested to them that after arriving in Europe, the first farmers mixed with local hunter-gatherers, accumulating more hunter-gatherer ancestry as they expanded, a process they called “demic diffusion.”5 Until recently, many archaeologists viewed the demic diffusion model as an exemplary merging of insights from archaeology and genetics.
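The technique described above, finding the combination of variations that best summarizes differences across individuals, reads like what is usually called principal component analysis. The sketch below assumes that reading and is only an illustration on simulated data. The sampling locations, allele frequencies, and function names are hypothetical, and the power-iteration routine stands in for whatever software was actually used. Individuals are sampled along a one-dimensional transect whose allele frequencies change steadily from one end to the other, and the first component is then compared with position along the transect.

```python
import random


def center_columns(matrix):
    """Subtract each column's mean so that every site has mean zero."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    col_means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    return [[row[j] - col_means[j] for j in range(n_cols)] for row in matrix]


def first_principal_component(matrix, iterations=200, seed=1):
    """Return one first-component score per row (up to sign and scale).

    Uses power iteration on the individual-by-individual Gram matrix,
    whose leading eigenvector is proportional to the first-component scores.
    """
    centered = center_columns(matrix)
    n = len(centered)
    gram = [[sum(a * b for a, b in zip(centered[i], centered[j]))
             for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    vec = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        vec = [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in vec) ** 0.5
        vec = [v / norm for v in vec]
    return vec


def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


rng = random.Random(0)  # fixed seed so the toy data are reproducible
n_sites = 300

# Hypothetical transect: five sampling locations, four individuals at each.
positions = [x for x in (0.0, 0.25, 0.5, 0.75, 1.0) for _ in range(4)]

# Allele frequency rises linearly along the transect (assumed, for illustration).
genotypes = [[1 if rng.random() < 0.2 + 0.6 * x else 0 for _ in range(n_sites)]
             for x in positions]

pc1 = first_principal_component(genotypes)
strength = abs(correlation(positions, pc1))  # the sign of a component is arbitrary
print(f"|correlation| between first component and position: {strength:.2f}")
```

In this toy setting the first component tracks position along the transect. A gradient of this kind shows only that frequencies change with geography. It does not by itself say what historical process produced the change, which is why a map of such components needs independent evidence before it can be read as a record of migration.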
The model that Cavalli-Sforza and colleagues proposed to describe the data was intellectually attractive, but it was wrong. Its flaws became apparent beginning in 2008, when John Novembre and colleagues demonstrated that gradients like those observed in Europe can arise without migration.6 They then showed that a Near Eastern farming expansion into Europe might counter-intuitively cause the mathematical technique that Cavalli-Sforza used to produce a gradient perpendicular to the direction of migration, not parallel to it as had been seen in the real data.7
It took the revolution wrought by the ability to extract DNA from ancient bones—the “ancient DNA revolution”—to drive a nail into the coffin of the demic diffusion model. The ancient DNA revolution documented that the first farmers even in the most remote reaches of Europe—Britain, Scandinavia, and Iberia—had very little hunter-gatherer-related ancestry. In fact, they had less hunter-gatherer ancestry than is present in diverse European populations today. The highest proportion of early farmer ancestry in Europe is today not in Southeast Europe, the place where Cavalli-Sforza thought it was most common based on the blood group data, but instead is in the Mediterranean island of Sardinia to the west of Italy.8
The example of Cavalli-Sforza’s maps shows why his grand bet went sour. He was correct in his assumption that the present-day genetic structure of populations echoes some of the great events in the human past. For example, the lower genetic diversity of non-Africans compared to Africans reflects the reduced diversity of the modern human population that expanded out of Africa and the Near East after around fifty thousand years ago. But the present-day structure of human populations cannot recover the fine details of ancient events. The problem is not just that people have mixed with their neighbors, blurring the genetic signatures of past events. It is actually far more difficult, in that we now know, from ancient DNA, that the people who live in a particular place today almost never exclusively descend from the people who lived in the same place far in the past.9 Under these circumstances, the power of any study that attempts to reconstruct past population movements from present-day populations is limited. In The History and Geography of Human Genes, Cavalli-Sforza wrote that he was excluding from his analysis populations known to be the product of major migrations, such as those of European and African ancestry in the Americas that owe their origin to transatlantic migrations in the last five hundred years, or European minorities such as Roma and Jews. His bet was that the past was a much simpler place than the present, and that by focusing on populations today that are not affected by major migrations in their recorded history, he might be studying direct descendants of people who lived in the same places long before. But what the study of ancient DNA has now shown is that the past was no less complicated than the present. Human populations have repeatedly turned over.
Figure 1a. A contour plot made by Luca Cavalli-Sforza in 1993 (adapted above) suggested that the movement of farmers from the east could be reconstructed from the patterns of blood group variation among people living today, with the highest proportions of such ancestry in the southeast near Anatolia.
Cavalli-Sforza’s transformative contribution to the field of genetic studies of human prehistory recalls the story of Moses, a visionary leader whose achievement was greater than that of anyone who followed him and who created a new template for seeing the world. The Bible says, “No prophet ever arose again in Israel like Moses,” but also tells how Moses was not allowed to reach the promised land. After leading his people for forty years through the wilderness, Moses climbed the mountain of Nebo and looked west over the Jordan River to see the land his people had been promised. But he was not allowed to enter that land. That privilege had been reserved for his successors.
Figure 1b. Modern genome-wide data show that the primary gradient of farmer ancestry in Europe does not flow southeast-to-northwest but instead in an almost perpendicular direction, a result of a major migration of pastoralists from the east that displaced much of the ancestry of the first farmers.
So it is with genetic studies of the past. Cavalli-Sforza saw before anyone else the full potential of genetics for revealing the human past, but his vision predated the technology needed to fulfill it. Today, however, things are very different. We have several hundred thousand times more data, and in addition we have access to the rich lode of information contained in ancient DNA, which has become a more definitive source of information about past population movements than the traditional tools of archaeology and linguistics.
The first five ancient human genomes were published in 2010: a few archaic Neanderthal genomes,10 the archaic Denisova genome,11 and an approximately four-thousand-year-old individual from Greenland.12 The next few years saw the publication of genome-wide data from five additional humans, followed by a burst of data from thirty-eight individuals in 2014. But in 2015, whole-genome analysis of ancient DNA went into hyperdrive. Three papers added genome-wide datasets from another sixty-six,13 then one hundred,14 and then eighty-three samples.15 By August 2017, my laboratory alone had generated genome-wide data for more than three thousand ancient samples. We are now producing data so fast that the time lag between data production and publication is longer than the time it takes to double the data in the field.