by Pääbo, Svante
Still, the Neanderthal genome was a tool that would allow us to begin to ask questions about what set Neanderthals and humans apart—a tool that not only we but all future generations of biologists and anthropologists would be able to use. The first step was obviously to make a catalog of all the genetic changes that happened in the ancestors of people living today after they separated from the ancestors of the Neanderthals. These changes would be many, and most of them would be without great consequences, but the crucial genetic events that we were interested in would be hidden among them.
The crucial task of making the first version of such a catalog of all changes unique to modern humans was taken on by Martin Kircher together with his supervisor Janet Kelso. Ideally, such a catalog should contain all genetic changes that are present today in all or nearly all humans and that occurred after modern humans parted ways with the ancestors of Neanderthals. The catalog should thus list positions in the genome where the Neanderthal looked like the chimpanzee and other apes while all humans, no matter where they lived on the planet, differed from the Neanderthals and the apes. However, in 2009 there were many limitations to how complete and correct such a catalog could be. First of all, we had sequenced only about 60 percent of the Neanderthal genome so the catalog could only be 60 percent complete. Second, even if we saw a difference from the human reference genome at a position where the Neanderthal genome looked like the chimpanzee genome, this did not necessarily mean that all humans today looked like the human reference genome. In fact, most such positions would vary among humans, but our knowledge about genetic variation among humans was too incomplete to differentiate real finds and false positives. Fortunately, there were several large projects under way aimed at describing the extent of genetic variation among humans, including the 1,000 Genomes Project, the goal of which was detecting all variants in the human genome present in 1 percent or more of humans. But that project was just starting. A third apparent limitation was that our genome was a composite of sequences from only three Neanderthals, and for most positions, we had only the sequence of a single Neanderthal individual. However, I didn’t view this as overly problematic. As long as one single Neanderthal had the ape-like, ancestral version at a given position, it didn’t matter if other Neanderthals that we hadn’t sequenced carried the derived, new version that we saw in humans today. 
The knowledge that the ancestral variant was in at least one Neanderthal told us that it had still been around when Neanderthals and modern humans parted ways, perhaps 400,000 years ago. This made it a potential candidate for defining what might be universally modern human.
Janet and Martin compared the human reference genome with the chimpanzee, orangutan, and macaque genomes and identified all positions where they differed. They then compared all four genomes to our Neanderthal DNA sequences, being careful to compare only those Neanderthal DNA sequences for which we had complete certainty as to where they came from in the genome. They found that we had Neanderthal sequence coverage for 3,202,190 positions where nucleotide changes had occurred on the human lineage. For the vast majority of these positions, the Neanderthals looked like us, which was not surprising, given that we are much more closely related to Neanderthals than to apes. But for 12.1 percent of these positions, the Neanderthal looked like the apes. They then checked whether the ancestral variants seen in apes and Neanderthals were still present in some humans today; in most cases they found both the ancestral and the new variants in present-day humans. This was not surprising because the mutations responsible happened quite recently. But some of these new variants were, as far as we could tell, present in all humans today. These were the positions that we found particularly interesting.
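The filtering logic described above can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the pipeline Janet and Martin actually used: the `Site` record, its field names, and the category labels are all hypothetical, and the sketch assumes the ancestral state is trusted only when the chimpanzee, orangutan, and macaque all agree.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Site:
    """One aligned genome position. All field names are hypothetical."""
    human_ref: str                    # base in the human reference genome
    outgroups: Tuple[str, str, str]   # chimpanzee, orangutan, macaque bases
    neanderthal: Optional[str]        # Neanderthal base, None if not covered
    human_alleles: FrozenSet[str]     # bases known among present-day humans


def classify(site: Site) -> str:
    """Sort a position into one of the categories discussed in the text."""
    # Assumption: infer the ancestral base only when all outgroups agree.
    if len(set(site.outgroups)) != 1:
        return "ancestral_state_unclear"
    ancestral = site.outgroups[0]
    if site.human_ref == ancestral:
        return "no_change_on_human_lineage"
    if site.neanderthal is None:
        return "no_neanderthal_coverage"
    if site.neanderthal == site.human_ref:
        # The Neanderthal "looked like us" at this position.
        return "neanderthal_shares_derived"
    if site.neanderthal != ancestral:
        # A third base: neither the human nor the outgroup state.
        return "ancestral_state_unclear"
    # The Neanderthal looks like the apes; do any humans still do so too?
    if ancestral in site.human_alleles:
        return "still_variable_in_humans"
    return "candidate_fixed_in_humans"
```

For example, `classify(Site("A", ("G", "G", "G"), "G", frozenset({"A"})))` falls into the last category, the one we found particularly interesting. As a rough check on scale, 12.1 percent of 3,202,190 covered positions is about 387,000 positions at which the Neanderthal looked like the apes.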
Most tantalizing were those changes that might have functional consequences. First and foremost among these were the ones that change amino acids in proteins. Proteins, of course, are encoded by stretches of DNA sequence in the genome called “genes.” Proteins are made up of strings of twenty different amino acids and perform many jobs in our bodies, such as regulating the activity of genes, building up tissues, and controlling our metabolism. As a result, changes in proteins are more likely to have consequences for an organism than a mutation randomly chosen from the set of all the mutations we identified. Such potentially meaningful mutations—which result in one amino acid in a protein being replaced by another one, or change how long a protein is—occur much less often during evolution than nucleotide substitutions that do not cause such dramatic alterations. Ultimately, Martin showed me a list of 78 amino-acid-altering nucleotide positions where, as far as we knew, all humans today were similar to one another but different from the Neanderthal genome and the apes. We expected to both add and subtract mutations from this list as both the Neanderthal genome and the 1,000 Genomes Project neared completion. So an educated guess might be that the total number of amino-acid changes that had spread to all modern humans since we separated from Neanderthals would be less than 200.
In the future, when we’ll have a much fuller understanding of how each protein influences our bodies and minds, biologists will often be able to affix a function to a particular amino acid in a protein and to identify whether it functioned the same way in Neanderthals. Unfortunately, such a comprehensive knowledge of our genome and biology will likely be achieved only long after I have joined the Neanderthals in death. However, I take some solace in the thought that the Neanderthal genome (and the improved versions of it that we and others will achieve in the future) will be a crucial contribution to this endeavor.
For the moment, though, the 78 amino-acid positions provided us with very few and only the very crudest of insights. Just looking at what the changes were gave us very little idea about what might have changed in the biology of the first individual to carry the new variant. However, one thing we did notice was that there were five proteins that each carried not just one but two amino-acid differences. This was very unlikely to have occurred by chance if a total of 78 mutations were to be randomly scattered among the 20,000 proteins encoded by the genome. These five proteins may therefore have altered their functions recently in human history. It is even possible that they lost their function or importance so that they were now free to accumulate changes unhindered by any constraints imposed upon them by their function. Either way, we knew we had to take a closer look at these five proteins.
The first protein with two changes was involved in sperm motility. I was not very surprised by this. Among human and nonhuman primates alike, genes involved in male reproduction and sperm motility have been known to frequently change, probably due to direct competition between sperm cells from different males when females copulated with multiple partners. This overt competition means that any genetic change that makes a sperm cell more likely to fertilize the egg than its competitors, perhaps by swimming faster, will spread in the population. Such a change is considered to be under positive selection, because it increases the chance the individual with the mutation will leave progeny in the next generation. In fact, the more direct competition there is between sperm cells from different males in a single female (head-to-head, so to speak), the more positive selection can act. So there is a correlation between the level of promiscuity in a species and the extent to which positive selection can be detected in genes that have to do with male reproduction. Among chimpanzees, where a female in estrus tends to copulate with all males that happen to be available to her, there is more evidence for positive selection on such genes than among gorillas, where one dominant, silverback male tends to monopolize all females in his group. The sperm of a patriarchal gorilla silverback have all the time they need to fertilize the egg, since the sperm of younger and subordinate males cannot enter into the race. Or rather, the competition has already taken place at an earlier stage on the social level, when the hierarchy in the group was established. Amazingly, even crude measures such as the size of the testicles relative to the body reflect this difference in male competition for fertilizations. Whereas chimpanzees have large testicles, and the even more promiscuous but smaller bonobos carry around even more impressive sperm factories, the intimidatingly huge silverback gorillas have puny little testicles. Humans, as measured both by testicle size and evidence for positive selection on genes relevant for male reproduction, seem to be somewhere between the extremes of chimpanzee promiscuity and gorilla monogamy, suggesting that our ancestors may have been not so unlike us, vacillating between emotionally rewarding fidelity to a partner and sexually alluring alternatives.
The second protein on Martin’s list that carried two changes had no known function—a reflection of our woefully inadequate knowledge of what genes do. A third one was involved in the synthesis of molecules necessary to produce proteins in the cells. I had no clue what that might mean, and wondered whether the gene actually had additional functions that were unknown to us—not at all an unlikely possibility given our limited knowledge about the function of genes. But the two remaining proteins with two amino-acid changes were both present in skin—one was involved in how cells attach to one another, particularly while wounds are healing, and the other was present in the upper layers of the skin, in certain sweat glands, and in hair roots. This suggested that something in the skin had changed during the course of recent human evolution. Perhaps future work will show that the former protein has something to do with the tendency for wounds to heal faster in apes than in humans, and that the latter has something to do with our lack of fur. But for the time being it is just not possible to tell. We are simply too ignorant about how genes affect the ways our bodies work.
A future version of Martin and Janet’s catalog, based on a complete version of the Neanderthal genome and more knowledge about genetic variation in people today, will contain the positions in the human genome that changed, and then spread to become present in all humans, between perhaps 400,000 years ago, when our ancestors parted ways with Neanderthals, and about 50,000 years ago, when the “replacement crowd” fanned out across the globe. After that time, no further changes could be established in all humans simply because humans were spread out across continents. Based on the numbers we obtained using the parts of the Neanderthal genome we had, we estimated that the total number of DNA sequence positions at which the Neanderthals differed from all humans today would be on the order of 100,000. This will represent an essentially complete answer to the question of what makes modern humans “modern,” at least from a genetic perspective. If in an imaginary experiment one were to change each of these 100,000 nucleotides back to its ancestral state in a modern human, the result would be an individual who, in a genetic sense, was similar to the common ancestor of Neanderthals and modern humans. In the future, one of the most important research objectives in anthropology will be to study this catalog in order to identify those genetic changes that are of relevance for how modern humans think and behave.
Chapter 21
Publishing the Genome
________________________________
In science, very few results are definitive. In fact, soon after arriving at an insight, often after great effort, one can generally foresee imminent developments that will improve upon it. Yet at some point, it is necessary to draw a line and say that the time has come to publish. In the fall of 2009, I felt that we had reached that point.
The paper that we were going to write would be a milestone in several ways. Above all else, it was the first genome sequenced from an extinct form of humans. True, Eske Willerslev’s group in Copenhagen, Denmark, had published a genome from a lock of Eskimo hair that spring. But the lock of hair was just 4,000 years old and had been preserved in the permafrost, and 80 percent of its DNA was human. The title of their paper said that they had sequenced an “extinct Palaeo-Eskimo,” although I wondered what present-day Eskimos thought about the contention that they were extinct. The Neanderthals were truly old, truly extinct, a different form of humans, and of crucial evolutionary importance as the closest relative of all present-day humans, no matter where they live on the planet. I also felt that we had set the technical stage for much future work; unlike carcasses preserved in permafrost, the bones we had used hadn’t been preserved in extraordinary ways. They were similar to thousands of human and animal bones found in caves in many parts of the world. I hoped that the techniques we had developed could now be used to recover whole genomes from many such remains. The finding most likely to create controversy was that Neanderthals had contributed parts of their genome to present-day people in Eurasia. But since we had come to this conclusion three times using three different approaches, I felt that we had definitively laid this question to rest. Future work would surely clarify the details of when, where, and how it had happened, but we had definitively shown that it had happened. The time had come for us to present our results to the world.
My ambition was to write a paper that would be as understandable as possible to a wide audience since not only geneticists would be interested in what we had done, but also archaeologists, paleontologists, and others. In fact, I was getting pressure from various directions to publish our findings. The Science editor was asking me when the paper would be submitted, and journalists kept calling not only me but other members of the team to ask when we would publish. I was starting to feel increasingly embarrassed about giving scientific talks that were focused more on technical issues than on what the genome told us, even though everybody realized that by now we must have interesting results to report. Despite the pressure, I felt that it was crucial to keep our main findings secret until publication. I worried that one of the fifty or so people in the know would tell a journalist that we had found evidence of Neanderthal gene flow in present-day people. If that happened, the news would quickly be all over the media.
An additional recurring worry was that another group would publish Neanderthal sequences before we did. This second worry was of course focused on one particular person: our previous partner and current competitor, Eddy Rubin at Berkeley, who, we knew, had access to Neanderthal bones and the resources necessary to work on them. I thought about all the efforts expended by everyone involved in this project over the past four years and imagined what it would feel like to wake up to newspaper headlines saying that Neanderthals had contributed genes to people today, based on perhaps ten times less data than we had, analyzed in haste. Quite uncharacteristically, I even found myself fretting about this as I tried to fall asleep at night.
It was impossible to hide my worries during our weekly phone meetings. I started to reiterate that no one was allowed to say anything about any aspect of our results to the press, however pushy a journalist might be. That not a single consortium member ever did so is testimony to the loyalty of the entire team. I also started to pressure everyone in the consortium to deliver descriptions of what they had done. This was less easy for them to achieve. Some scientists are so driven by intellectual curiosity that once they’ve found the solution to a problem, they will be remiss in going through the tedium of writing it up and publishing it. This, of course, is very bad. Not only does the public, which has ultimately funded the research, have a right to learn about the results, but other scientists also need to know the details of how results were achieved so that they can improve and build on them. In fact, this is the main reason why, when scientists are being considered for appointments and promotions, they’re judged not on how many interesting projects they have started but, instead, on how many projects they have finished and published. Some members of the consortium delivered their texts quickly, some slowly and in a preliminary form, and some not at all. I thought about how to pressure even distinguished colleagues to deliver their write-ups and finally came up with an idea: I needed to take advantage of their vanity.
Most scientists, like most people, want recognition for a job well done. They thrive on how often their papers are cited in other publications and how many invitations they get to deliver lectures. Apportioning the credit in our case would be difficult. Several groups and more than fifty scientists had contributed to our project and would appear as authors on the paper, and it would be hard to attribute credit to individuals for each of the different, often very creative and laborious analyses that had been done. Everybody had worked selflessly toward the common goal, but it still seemed only fair to apportion some individual credit. The question I faced was how to do that, and in the process also stimulate people to write quickly and well.
As is typical of many large scientific papers, most results presented in our paper would be presented as so-called supplementary material that wouldn’t be included in the print journal but would instead be published electronically on the journal’s website. The bulk of this considerable pile of material would be the technical minutiae interesting only to the experts. Normally, the authors of the supplementary material are the same and appear in the same order as on the paper. I decided to change that. I suggested that each section of this supplementary material would have separate authors and include a corresponding author to whom any interested readers would be referred in case they had questions. This system would make much clearer who had done which experiments and analyses. It would also make each person personally responsible for the quality of the section, as any glory—or any blame—would be directed at least partly to him or her. To further improve the quality, we assigned one member of the consortium not involved in that particular aspect of the work to carefully read such supplementary sections in order to find errors and faults in the presentation. This all helped. People actually delivered their supplementary sections, which eventually swelled to 19 chapters and 174 pages. My task became to modify these sections and write the main text that would be printed in the journal. In this, the ever energetic David Reich was a great help. There was much e-mailing about changes to the text of the main paper but finally, in the first days of February 2010, Ed Green submitted everything to Science.