by Carl Zimmer
Let that delusion pass.
If you look long enough at E. coli’s genome, you will come across hundreds of pseudogenes, instructions with catastrophic typographical errors. You will encounter the genes of viruses that respond to stress by making new viruses and killing their host. Other instructions are mysteriously clumsy, redundant, and roundabout. Still others are cases of outright plagiarism.
Where the metaphor of an instruction manual collapses, other metaphors can take its place. My favorite is an old battered book that sits today in a museum in Baltimore. It was created in Constantinople in the tenth century. A Byzantine scribe copied the original Greek text of two treatises by the ancient mathematician Archimedes onto pages of sheepskin. In 1229, a priest named Johannes Myronas dismantled the book. He washed the old Greek text from the pages with juice or milk, removed the wooden boards, and cut the binding strings on the spine. Myronas then used the sheepskin to write a Christian prayer book. This sort of recycled book is known as a palimpsest.
Despite its new incarnation, the Archimedes palimpsest carried traces of the original text. The prayer book was passed from church to church, scorched in a fire, splashed with candle wax, freshened up with new illuminations, and colonized by purple fungus. In 1907, a Danish scholar named Johan Heiberg discovered that the battered prayer book was in fact the only surviving copy of Archimedes’ treatises in their original Greek. But with only a magnifying glass to help him, Heiberg could make out very little of the ancient text. A century later conservationists are making more progress. They are illuminating Archimedes’ works with beams of X-rays that light up atoms of iron in the original ink, resurrecting a glowing text of Greek. The palimpsest reveals new depths to the genius of Archimedes, who turns out to have been contemplating calculus and infinity and other concepts that would not be rediscovered for centuries.
E. coli’s genome is not so much a manual as a living palimpsest. E. coli K-12, O157:H7, and all the other strains evolved from a common ancestor that lived tens of millions of years ago. And that common ancestor itself descended from still older microbes, stretching back over billions of years. The genetic history of E. coli is masked by mutations, duplications, deletions, and insertions. Yet traces of those older layers of text survive in E. coli’s genome, like vestiges of Archimedes.
Until recently, scientists had only crude tools for reading those hidden layers. They struggled like Heiberg with his magnifying glass. They are now getting a much better look at the palimpsest. Like Archimedes’ ancient treatise, they’re finding, E. coli’s genome is a book of wisdom. It offers hints about how life has evolved over billions of years—how complex networks of genes emerge, how evolution can act like an engineer without an engineer’s brain. Nested within E. coli’s genome are clues to the earliest stages of life on Earth, including the world before DNA. Those clues may someday help guide scientists to the origins of life itself.
THE TREE OF LIFE
To read E. coli’s palimpsest, scientists have had to figure out which parts of its genome are new and which are old. The answer can be found in the genealogy of germs. A family tree of the living strains of E. coli indicates that they all descend from a common ancestor that lived some 10 million to 30 million years ago. Even farther back, E. coli shares an ancestor with other species. Reach back far enough, and you ultimately encounter the ancestor E. coli shares with all other living things, ourselves included.
Reconstructing the tree of life—one that includes E. coli and humans and everything else that lives on Earth—has been one of modern biology’s great quests. In 1837, Charles Darwin drew his first version of the tree of life. On a page in his private notebook he sketched a few joined branches, each with a letter at its tip representing a species. Across the top of the page he wrote, “I think.”
The fact that species have common ancestors explains why they share many traits. As different as bats and humans may seem, we are both hairy, warm-blooded, five-fingered mammals. Darwin himself did not try to figure out exactly how all the species alive were related to one another, but within a few years of the publication of The Origin of Species, other naturalists did. The German biologist Ernst Haeckel produced gorgeous illustrations of trees sprouting graceful bark-covered boughs. His trees were accurate in many ways, scientists would later find. But Haeckel marred them with a stupendous anthropocentrism. To Haeckel, the history of life was primarily the history of our own species. His tree looked like a plastic Christmas tree, with branches sticking out awkwardly from a central shaft. He labeled the base of the tree Moneran, the name he used for bacteria and other single-celled organisms. Farther up the tree were branches representing species more and more like ourselves—sponges, lampreys, mice. And atop the tree sat Menschen.
This view of life has been a hard one to shake. It probably had something to do with the decision to split life into prokaryotes and eukaryotes, the supposedly primordial bacteria and the “advanced” species like ourselves that evolved from them. It’s a deeply flawed view. The evolution of life was not a simple climb from low to high. E. coli is a species admirably adapted to warm-blooded creatures that did not emerge for billions of years after life began. It is as modern as we are.
It took a long time for a more accurate picture of the tree of life to take hold. One major obstacle was the lack of information scientists could use to determine how E. coli is related to other bacteria, or how bacteria are related to us. To compare ourselves to a bat, we can simply use our eyes to study fur, fingers, and other parts of our shared anatomy. Under a microscope, however, many bacteria look like nondescript balls or rods. Microbiologists sometimes classified species of bacteria based on little more than their ability to eat a certain sugar, or the way they turned purple when they were stained with a dye. It was not until the dawn of molecular biology that scientists finally got the tools required to begin drawing the tree of life. Experiments on E. coli helped them to recognize that all living things share the same genetic code, and the same way of passing on genetic information to their descendants. They share these things because they share a common ancestry.
In the 1970s, Carl Woese, a biologist at the University of Illinois, Urbana-Champaign, discovered a way to use those shared molecules to draw a tree of life. Woese and his colleagues teased apart ribosomes, the factories for making proteins, and studied one piece of RNA, known as 16S rRNA. Woese did his work years before scientists could easily read the sequence of RNA or DNA. So he and his colleagues did the next best thing: they sliced up E. coli’s 16S rRNA with the help of a viral enzyme. They then cut up the 16S rRNA of other microbes and gauged how similar their fragments were to those of E. coli. They discovered many regions that were identical, base for base, no matter which species they compared. These regions had not changed over billions of years. The regions that had diverged revealed which species were more closely related than others.
Rough and preliminary as the results were, they upended decades of consensus. The standard classifications of many groups of bacteria turned out to be wrong. Most startling of all, Woese and his colleagues found that a number of bacteria were closer to eukaryotes than to other bacteria. They were not bacteria at all. Woese and his colleagues declared that life formed not two major groups of species but three. They dubbed the third domain of life archaea. “We are for the first time beginning to see the overall phylogenetic structure of the living world,” Woese and his colleagues declared.
Over the next thirty years, scientists built on Woese’s work, drawing a more detailed picture of the tree of life. They studied ribosomal RNA in more species. They found other genes that also made for good comparisons. They used new statistical methods that gave them more confidence in their results. They found many more species of archaea, confirming it as a genuine branch of life. Archaea may look superficially like bacteria, but they have some distinctive traits, such as unique molecules that make up their cell walls.
To measure the diversity of life, Woese and his colleagues counted up the mutations to ribosomal RNA that had accumulated in each branch of life. The more mutations, the longer the branch. The new tree was a far cry from Haeckel’s. The animal kingdom became a small tuft of branches nestled in the eukaryotes. Two bacteria that might look identical under a microscope were often separated by a bigger evolutionary gulf than the one that separates us from starfish or sponges. One look at the tree made it clear that the evolutionary history of any individual species of bacterium—E. coli, for example—is a complicated tale.
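The idea of counting mutations can be made concrete with a toy sketch. The Python below is only an illustration, not the method Woese's group used: it assumes the sequences are already aligned to the same length, the species names and sequences are invented, and the function names are hypothetical. It simply counts the positions at which each pair of sequences differs, so that a larger count stands in for a longer branch.

```python
def count_differences(seq_a, seq_b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def pairwise_differences(sequences):
    """Return a dict mapping each pair of names to its difference count."""
    names = sorted(sequences)
    return {
        (x, y): count_differences(sequences[x], sequences[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Invented toy fragments, NOT real 16S rRNA sequences.
toy_rrna = {
    "species_a": "GAUUACAGGCUA",
    "species_b": "GAUUACAGGCUU",
    "species_c": "GAAUACUGCCUU",
}

if __name__ == "__main__":
    for pair, diff in pairwise_differences(toy_rrna).items():
        print(pair, diff)
```

In this toy data, species_a and species_b differ at a single position, while species_c differs from both at several, so it would sit at the end of a longer branch. Real analyses also have to align the sequences first and correct for repeated mutations at the same site, which this sketch ignores.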
TREE VERSUS WEB
In the 1980s, some experts on the tree of life became worried. It was slowly becoming clear that horizontal gene transfer was not just a peculiarity of E. coli’s laboratory sex or the modern era of antibiotics. Genes had moved from species to species long before humans had begun to tinker with life. If genes moved too often, some scientists feared, they might make it impossible to reconstruct the tree’s branches.
To reconstruct the tree of life, scientists compare DNA from different species and come up with the most likely pattern of branches that could have produced the differences. A genetic marker shared by two species might reveal that they had a close common ancestry, one not shared by species that lack the marker. But those markers make sense only if life passes down all its genes from one generation to the next. If a gene slips from one species to another, it can create an illusion of kinship that’s not actually there.
At first, most scientists dismissed this sort of fretting. Over the course of billions of years, they believed, horizontal gene transfers were inconsequential. To find the true tree of life, scientists assumed they just had to avoid those rare swapped genes.
In later years it became possible to get a better sense of how much horizontal gene transfer has occurred by comparing genomes. The genomes of humans and other animals didn’t show much evidence of recently transferred genes. That’s not too surprising when you consider how we reproduce. Only a few cells in an animal—eggs and sperm cells—have a chance to become a new organism. And these cells have very little contact with other species that might bequeath DNA to them. (The chief exceptions to this rule are the thousands of viruses that have inserted themselves in our genomes.) But in this respect, animals were oddities. Bacteria, archaea, and single-celled eukaryotes turned out to have traded genes with surprising promiscuity. And those traded genes, some scientists argued, posed a serious threat to the dream of drawing the full, true tree of life.
W. Ford Doolittle, a biologist at Dalhousie University in Halifax, Nova Scotia, illustrated the seriousness of the threat in an article in Scientific American in 2000. The article includes a picture of two trees. The first shows the tree of life as revealed by ribosomal RNA, with bacteria, archaea, and eukaryotes branching off in an orderly fashion from a common ancestor. The second shows what the history of life might really look like: a tree emerging from a mangrovelike network of roots, with branches fused into a tangle of shoots. Parts of it look less like a tree than a web.
As with most scientific debates in biology, the tree-versus-web debate is not an all-or-nothing battle. The web champions, such as Doolittle, don’t deny that organisms are related to one another by common descent. They just think that searching for one true tree of life by comparing genes is a futile quest. The tree champions do not deny that horizontal gene transfer happens or that it is biologically important. They simply argue that the right genes can reveal the true relationships among all living things on Earth.
As scientists have begun to compare the entire genomes of many species for the first time, a number of them have decided that the tree of life still stands. Howard Ochman came to this conclusion on the basis of a survey he and his colleagues made of E. coli and a dozen other species of bacteria. The scientists found a number of genes that showed signs of having moved by horizontal transfer. But most of those genes had moved relatively recently—only after each species in their study had branched off from the others.
Horizontal gene transfer is common, the scientists found, but the genes usually don’t survive very long in their hosts. Many of them become disabled by mutations, turning into pseudogenes. Eventually, other mutations slice the genes out of their genomes completely, and the bacteria suffer no ill effects from the loss. A few genes ferried into the ancestors of E. coli and other bacteria did manage to establish themselves and can still be found in many living species today. But in order to avoid oblivion, they seem to have abandoned their wandering ways. Once a virus inserted them into a host genome, they did not leave it again. Ochman and his colleagues concluded that even though genes regularly move between the branches of life, the branches remain distinct.
THE ROAD TO ESCHERICHIA
The newest versions of the tree of life look nothing like Haeckel’s Christmas tree. Scientists can now compare thousands of species at once, and the only way to draw all of their branches is to arrange them like the spokes on a wheel. At the center of the wheel is the last common ancestor of all life on Earth today. From the center you can move outward, steering from branch to branch to follow the evolution of a particular lineage. To get to our own species, you first travel up to the common ancestor of archaea and eukaryotes. From there you bear right onto the eukaryote branch. Our ancestors remained single-celled protozoans until about 700 million years ago. They parted ways with the branches that would give rise to multicellular plants and fungi. Eventually the path takes you to the animal kingdom. Bear right again and you follow our ancestors as they become vertebrates. The ancestors of other vertebrates branch off along the way: zebrafish, chickens, mice, chimpanzees. Finally the line ends with Homo sapiens.
But enough about you. A different route travels from the common ancestor to E. coli. The journey is just as long and no less interesting.
The last common ancestor of all living things was probably much simpler than E. coli. While each species today carries some unique genes, it also shares genes found in all other species. These universal genes probably are the legacy of the last common ancestor. A simple search for universal genes brings up a pretty short list, about 200 genes long. The common ancestor probably had a bigger genome, because many genes have been lost over the history of life. Christos Ouzounis and his colleagues at the European Bioinformatics Institute in Cambridge estimate that its full genome contained somewhere between 1,000 and 1,500 genes. Even if Ouzounis is right, however, the last common ancestor of all living things had only a third or a quarter of the genes that a typical strain of E. coli has today.
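A "simple search for universal genes" amounts, at its most basic, to asking which genes turn up in every genome examined. The Python sketch below shows that logic as a set intersection. It is a simplified illustration only: the organisms and gene names are made up, the function name is hypothetical, and it assumes that matching genes across species has already been done, which in practice is the hard part.

```python
def universal_genes(genomes):
    """Return the genes present in every genome (empty set if none given)."""
    gene_sets = list(genomes.values())
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Invented gene inventories, for illustration only.
toy_genomes = {
    "bacterium": {"geneA", "geneB", "geneC", "geneD"},
    "archaeon": {"geneA", "geneB", "geneE"},
    "eukaryote": {"geneA", "geneB", "geneC", "geneF"},
}

if __name__ == "__main__":
    print(sorted(universal_genes(toy_genomes)))
```

Here geneC is missing from the toy archaeon, so it drops off the universal list even if the common ancestor carried it. That is the same reason the text gives for thinking the real ancestor's genome was larger than the short list suggests.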
That last common ancestor did not have early Earth all to itself. It shared the planet with an uncountable number of other microbes. Over time the other branches on the tree of life became extinct while our own survived. The world on which these early microbes lived was profoundly different from our own. Four billion years ago, Earth was regularly devastated by gigantic asteroids and miniature planets. Some of the impacts may have boiled off the oceans. As the water slowly fell back to Earth and grew into seas again, life may have found refuge in cracks in the ocean floor. It may be no coincidence that on the tree of life some of the deepest branches belong to heat-loving species that live in undersea hydrothermal vents.
Once Earth became more habitable, the descendants of the common ancestor fanned out. They spread across the seafloor, growing into lush microbial mats and reefs. Continents swelled up, and early organisms moved ashore, forming crusts and varnishes. Along the way they evolved new ways to feed and grow. Some bacteria and archaea consumed carbon dioxide and used iron or other chemicals from deep-sea vents as a source of energy. They built up a supply of organic carbon that other microbes began to feed on.
E. coli may descend from those ancient scroungers. Its ancestors certainly could not have been living inside humans 3 billion years ago, or inside any other animal for that matter. Some of E. coli’s closest living relatives (a group collectively known as gamma-proteobacteria) offer some clues to what E. coli’s ancestors might have been doing then. Some eat oil that oozes from cracks in the seafloor. Others live on the sides of undersea volcanoes, where they glue themselves to passing bits of protein. E. coli may have acquired its metabolism from such carbon-scrounging ancestors.
E. coli’s complex social life—forming biofilms, waging wars with colicins, and so on—may have also had its origins in free-living ancestors in the ocean. Aquatic microbes today have intensely social lives, living mainly in biofilms rather than floating alone as individuals.
About 2.5 billion years ago, the ancestors of E. coli were rocked by a planetwide catastrophe: oxygen began to build up in the atmosphere. To us oxygen is essential to life, but on the early Earth it was poison. Initially the planet’s atmosphere was a smoggy mix of molecules, including heat-trapping methane produced by bacteria and archaea. Free oxygen was rare, in part because the molecules rapidly reacted with iron and other elements to form new molecules. Life changed the planet’s chemistry when some bacteria evolved the ability to capture sunlight. They gave off oxygen as waste, and after 200 million years it began to build up in the atmosphere. Unless an organism can protect itself, oxygen can be lethal. Thanks to its atomic structure, oxygen is eager to attack other molecules, wresting away atoms to bond with. The new oxygen-bearing molecules can roam through a cell, wrecking DNA and other molecules they encounter.
For the first billion and a half years of life, the planet had been mercifully free of the oxygen menace. And then, 2.5 billion years ago, oxygen levels rose tenfold. The oxygen revolution may have driven many species extinct, while others found refuge in places where oxygen levels remained low—deep inside mudflats, for example, or at the bottom of the ocean. But some species, including the ancestors of E. coli, adapted. They acquired genes that protected them from oxygen’s toxic effects. Once shielded, their metabolism evolved to take advantage of oxygen, using it to get energy out of their food far more efficiently than before. Today E. coli can still switch back and forth between its ancient oxygen-free metabolism and its newer network, depending on how much oxygen it senses in its environment.