In the metabolic library, though, that’s exactly what a browser can do. Its myriad texts with the same meaning could be like stars in our universe, islands separated by vast expanses of dark empty space. But they are not. You can travel between them on a network of well-lit paths.
Thus far, we had cataloged only the volumes of one subject area—viability on glucose—but many other subject areas exist. There are metabolisms that are viable on ethanol, acetate, and dozens of other fuels. And we mapped them, using the same random browsing strategy: different random walks whose editing steps preserved the phenotype—viability on ethanol, for example—and stopped only when we could walk no further. We did that for eighty different fuels, and each time we saw the same pattern. Viable metabolisms can have very different texts—they share as little as 20 percent of their reactions—and form a vast connected genotype network in the metabolic library.
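For readers who like to see such a walk spelled out, here is a minimal sketch in Python. It is a toy and not the analysis described above: the real work judged viability with a computational model of metabolism, whereas the `viable` rule below (at least one of a few randomly drawn three-reaction pathways must be complete), along with every name and number in it, is an assumption made purely for illustration.

```python
import random

N_REACTIONS = 60                      # hypothetical size of the reaction universe
setup_rng = random.Random(0)

# Assumed toy rule: twelve alternative pathways, each needing three reactions.
PATHWAYS = [frozenset(setup_rng.sample(range(N_REACTIONS), 3)) for _ in range(12)]

def viable(genotype):
    """A genotype (a set of reaction indices) is viable if any pathway is complete."""
    return any(pathway <= genotype for pathway in PATHWAYS)

def random_walk(start, steps, rng):
    """Add or delete one random reaction per step, keeping only viable edits."""
    current = frozenset(start)
    for _ in range(steps):
        candidate = current ^ {rng.randrange(N_REACTIONS)}  # toggle one reaction
        if viable(candidate):
            current = candidate
    return current

def shared_fraction(a, b):
    """Reactions present in both genotypes, as a fraction of those present in either."""
    return len(a & b) / len(a | b)

start = frozenset(range(N_REACTIONS))   # begin with every reaction present
end = random_walk(start, 2000, random.Random(1))
print(viable(end), round(shared_fraction(start, end), 2))
```

The walker never leaves the set of viable texts, yet the fraction of reactions it still shares with its starting point can fall well below one. How far it falls in this toy depends entirely on the assumed pathways, so the number it prints says nothing about real metabolisms.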
Emboldened by this general pattern, we began to map metabolisms viable on multiple different fuels like ethanol, glucose, and acetate, able to synthesize all biomass molecules from each of them. (The advantage of this ability is obvious: It permits survival when the supply of any one fuel runs out.) Because that metabolic skill would be more difficult to achieve, perhaps only a few metabolisms might have it, all of them shelved in one corner of the library? We were again proven wrong. We studied metabolisms viable on five, ten, twenty, and up to sixty different fuel molecules. Each meaning-preserving random walk starting from one of them led far away. Even some metabolisms viable on sixty different fuels shared fewer than 30 percent of reactions. And the metabolisms with the same phenotype—countless trillions in each subject category—again formed a connected genotype network.45
At this point, I was close to ecstatic. We had stumbled upon fundamental principles that govern the metabolic library’s organization. First, many metabolisms are viable on the same fuel molecules—it matters little which fuels you choose. Organisms can assemble biomass building blocks in many ways, through many different sequences of reactions. Second, many of these metabolisms are very different from each other, sharing only a minority of reactions. Third, the viable metabolisms we found were connected in a gigantic network—a genotype network. This genotype network reached far through the space of metabolism.46 Each subject area has one such genotype network, and these networks form a densely woven fabric in the metabolic library.
We had accomplished all this with modest means, because our computing power was puny when compared to the number of texts in the metabolic library. We had crudely mapped a world vast beyond imagination. We had crossed an ocean in a bathtub.
Myriad metabolic texts with the same meaning raise the odds of finding any one of them—myriad-fold. Even better, evolution does not just explore the metabolic library like a single casual browser. It crowdsources, employing huge populations of organisms that scour the library for new texts. Every time gene transfer alters the metabolic genotype of an organism, it takes a step through the library. Different readers—billions of them—walk off in different directions to explore the library.
Evolution’s library exploration also differs in another way from how we humans would browse a library. Imagine a hapless organism that steps off the path connecting viable metabolisms by encountering a change—perhaps a gene deletion—that disrupts the metabolic instructions for manufacturing a key molecule. That organism would be dead, courtesy of natural selection. In the metabolic library readers die (and others get born) in an exploration that unfolds over generations.
Viewed from afar, the library’s explorers, from bacteria to blue whales, might appear like giant clouds of dust grains—dwarfed by the library itself—drifting this way and that, from one stack to the next, endlessly meandering swirls of living things that try new combinations of chemical reactions over and over and over again. Some die. Others survive, and pass innovative combinations on to subsequent generations. This churning mass of life is evolution in action.
That action would vanish if genotype networks did not exist. If only one text could confer viability on any one fuel, then all members of a population would have to share that text, crowding around it in the library. Whenever a member stepped aside to sneak a peek at another volume, it would die. If only a few similar texts were viable, the population could only explore a tiny section of the library. But because of genotype networks, evolving populations can explore the library far and wide.
Genotype networks are the first of two keys to innovability. And now for the second: the immense diversity of the neighborhoods where these explorations begin.
Imagine a patch of soil where billions of bacteria can thrive as long as new food arrives occasionally—a leaf blown in from afar, a rotting carcass, or perhaps a ripe apple dropping from a tree. Many molecules in these foods are nutritious, but they would be useless unless one of the microbes had acquired the right combination of enzymes, the right metabolic text for transforming the food into biomass. That very text could be lifesaving once the billions of its soil-mates, all of them hungry, have consumed and exhausted other fuel molecules. This text, a metabolic innovation, could give one microbe a new lease on life.
Even if there were only a hundred fuel molecules, some 10³⁰ metabolic phenotypes would exist, and finding this text is to find a specific one of them. There is simply no way you could pack 10³⁰ texts into a small neighborhood of the library. Each neighborhood only has enough space for a few thousand texts, whose meanings can encompass a tiny fraction (one in 10²⁶) of all possible phenotypes. It’s as if you borrowed a few volumes at random from the New York Public Library to fill your nightstand and hoped to find The Origin of Species among them: almost impossible. But these odds change if a crowd of readers can browse the library along a genotype network that extends far through the library. Because genotype networks are so large, the population could explore thousands of neighborhoods and increase the odds of finding the new lifesaving phenotype.
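The arithmetic behind these figures is easy to check. The sketch below assumes that a phenotype is simply a yes-or-no answer for each of the hundred fuels (viable on it or not), so that the number of phenotypes is 2 raised to the hundredth power, and it takes five thousand as a stand-in for "a few thousand" texts per neighborhood. Both choices are illustrative assumptions.

```python
n_fuels = 100
n_phenotypes = 2 ** n_fuels          # one yes/no answer per fuel

neighborhood_size = 5_000            # assumed stand-in for "a few thousand" texts
one_in = n_phenotypes // neighborhood_size

print(f"{n_phenotypes:.2e} possible phenotypes")
print(f"a neighborhood covers at most one in {one_in:.1e} of them")
```

Even if every text in a neighborhood had a different meaning, the neighborhood would cover on the order of one in 10²⁶ of all phenotypes.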
You may have noticed a hidden premise here: Different neighborhoods must contain different novel phenotypes. Near a volume on photovoltaics, you would find others on medieval French literature, twentieth-century architecture, and Italian cooking, whereas near another volume on photovoltaics—in a different part of the library—books on toy trains, World War II, and astrophysics would be shelved. In metabolic terms, you might find metabolisms viable on acetate and ethanol and citrate in one neighborhood, and metabolisms viable on sucrose and fructose in another.
To find out whether this bizarre library organization really exists, we chose pairs of metabolic texts that had the same phenotype (viability on glucose) but that were otherwise very different. The two metabolisms, A and B, were located in different parts of the library—they did not share many reactions—yet both were part of the same genotype network. We then examined the phenotypes of all their five-thousand-odd neighbors, and found that some of them were likewise viable on glucose—they belonged to the same genotype network—while others had lost a critical chemical reaction, which spells death. Yet other neighbors—those we were really interested in—could live on a new combination of fuels, such as ethanol or fructose. For these networks we asked: Do the neighbors of metabolic genotype A—those texts that differ from A in only a single reaction—contain metabolic innovations different from those of the neighbors of metabolic genotype B? If the neighborhood of A contained metabolisms viable on the new fuels ethanol and fructose, would the neighborhood of B contain metabolisms viable on, say, acetate and sucrose?
After analyzing thousands of network pairs, and after studying phenotypes involving eighty different fuel molecules, we had found that the premise was correct. Different neighborhoods contain texts with new meanings, but these meanings differ between neighborhoods. Most metabolic innovations are unique to one neighborhood and do not occur in the other. (Because each new phenotype has its own genotype network, this also means that different genotype networks in the library are interwoven in an unfathomably complex way.)
We then went one step further. With our computers’ help, we wandered once again through a genotype network in the metabolic library, except that now we behaved like inventory clerks with their notepads, listing all the innovations in the immediate neighborhood of our path, all the innovations that were within easy reach. We listed all the different new phenotypes in the walker’s neighborhood before the start of the walk, and examined the neighborhood again after the first step. If it contained a new phenotype that was not already on the list, we added it to the list, took one further step, examined the new neighborhood, added any new phenotypes, and so on, for thousands of steps. Because we knew that different neighborhoods contain different innovations, we expected the list to grow over time, as new phenotypes became accessible. But we expected that we would run out of new phenotypes eventually.
Wrong. Long after our notepads were full, we were still encountering innovations.
Worried that this trip had yielded an unusually rich bounty, we went on many more trips, from different starting points in the library, metabolisms viable on different fuel molecules. And we also crowdsourced our shopping, exploring the library not with a single metabolism but with entire populations of evolving metabolisms to tally how many different new phenotypes they found. In every instance, innovations continually piled up, with no sign of slowing down, at a steady clip, unceasingly, no matter how long the exploration continued, a hundred, a thousand, or ten thousand steps, hours, days, weeks, until we ran out of time and needed to do other work. We realized that the innovability of an evolving metabolism would not exhaust itself in our lifetime.47
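The notepad procedure can be sketched in a few lines of Python. This is a toy under stated assumptions, not the method used in the work described here: phenotypes are computed with a made-up rule (a metabolism is viable on a fuel if one of four randomly drawn three-reaction pathways is complete), the walk preserves the entire starting phenotype, and all names and sizes are hypothetical.

```python
import random

N_REACTIONS = 40                      # hypothetical reaction universe
N_FUELS = 8                           # hypothetical number of fuels
setup_rng = random.Random(2)

# Assumed toy rule: four alternative three-reaction pathways per fuel.
PATHWAYS = {
    fuel: [frozenset(setup_rng.sample(range(N_REACTIONS), 3)) for _ in range(4)]
    for fuel in range(N_FUELS)
}

def phenotype(genotype):
    """The set of fuels on which a genotype (set of reactions) is viable."""
    return frozenset(
        fuel for fuel, options in PATHWAYS.items()
        if any(pathway <= genotype for pathway in options)
    )

def innovations_nearby(genotype, home):
    """Phenotypes of one-reaction neighbors that include a fuel not in `home`."""
    found = set()
    for reaction in range(N_REACTIONS):
        neighbor_phenotype = phenotype(genotype ^ {reaction})
        if neighbor_phenotype - home:
            found.add(neighbor_phenotype)
    return found

def tally_walk(start, steps, rng):
    """Walk while preserving the home phenotype; record the notepad size each step."""
    home = phenotype(start)
    current = frozenset(start)
    notepad = innovations_nearby(current, home)
    counts = [len(notepad)]
    for _ in range(steps):
        candidate = current ^ {rng.randrange(N_REACTIONS)}
        if phenotype(candidate) == home:      # step only if the meaning is unchanged
            current = candidate
        notepad |= innovations_nearby(current, home)
        counts.append(len(notepad))
    return counts

start = PATHWAYS[0][0]                # a minimal text viable on fuel 0
counts = tally_walk(start, 500, random.Random(3))
print(counts[0], counts[-1])
```

One caveat matters here. With only eight fuels there are at most 2⁸ phenotypes, so the toy notepad must stop growing sooner or later. The never-ending accumulation described above belongs to a library whose stock of phenotypes is astronomically larger.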
Innovability in the metabolic library is near limitless, and for that both genotype networks and diverse neighborhoods are required. They are the two keys to innovability. Genotype networks guarantee that evolving populations can explore the library. Without them the lethal punishment of losing viability would be inevitable. But without diverse neighborhoods in this library, exploring a genotype network would be pointless: The exploration would not turn up many texts with new meanings.
Any librarian who wanted to organize a human library in this way would be locked away. Even if a thousand books told the same story in different ways, no sane librarian would create sections that placed books with all manner of different meanings next to one another. And he would certainly not pack different neighborhoods around synonymous texts with books in different subject categories.
But a closer look reveals that the metabolic library’s catalog is far from a madman’s febrile fantasy. Human libraries are useful only because we have librarians who make catalogs suitable for us, where books on photovoltaics stand on one shelf, those on French literature on another, and so on. For a library whose readers have no catalog and can only take random steps, and where missteps are punishable by death, it would be disastrous, because they would be stuck on whatever shelf they started. They would be idiots savants, world experts in one area but completely ignorant in all others, and could never learn anything new—not a smart strategy for surviving in an ever-changing world. For such readers, the metabolic library is perfect, uncannily well set up for innovation. It guarantees eternal learning and innovability.
Even more uncanny: Life’s other libraries are organized the same way.
CHAPTER FOUR
Shapely Beauties
The Arctic cod is a slender, brownish fish, with a silver belly and black fins, between eighteen and thirty centimeters long, a perfectly unremarkable occupant of the world’s oceans. Except for one thing: The Arctic cod—Boreogadus saida—lives and thrives within six degrees of the North Pole, nine hundred meters beneath the surface, in waters that regularly chill below zero degrees Celsius.
At that temperature, the internal fluids of most organisms turn into ice crystals with edges as beautiful as well-forged swords, and just as deadly, for they carve up living tissue like butter. Warm-blooded animals have a built-in thermostat that allows them to survive in subfreezing weather. Fish don’t. And yet, there’s the Arctic cod.
B. saida survives by producing antifreeze proteins that lower the freezing temperature of its body fluids, much like the antifreeze in a car’s engine coolant. These proteins are prototypical examples of nature’s innovative powers. Change the amino acid sequence needed to produce a particular protein and, presto, huge areas of the earth’s oceans become livable.1
Antifreeze proteins are among thousands of innovative wonders that populate the cells of fish and of every other living being. If you could shrink yourself and travel through a cell, you would first be astonished by how many different kinds of molecules there are, millions of them. Tiny molecules like water, larger molecules like sugars or amino acids, and even larger macromolecules like proteins all push and jostle and shove past each other like subway commuters during rush hour.
Proteins, the giant hulking monsters of a cell’s molecular population, are life’s workhorses. We have met the metabolic enzymes that synthesize everything a cell needs—including their own amino acids—by linking smaller molecules, cleaving them like molecular scissors, or simply rearranging their atoms.2 But not all proteins are enzymes. Some are molecular motors, like the proteins that help your muscles contract, or like the kinesins that “walk” along stiff molecular cables that crisscross the cell, carrying tiny membrane vesicles that shelter various molecular cargoes. Mayhem ensues when these truckers of the cell no longer do their job. One kinesin, for example, transports building materials needed to wire the cells in our nervous system, and mutations in its gene can cause an incurable disease called Type 2A Charcot-Marie-Tooth disease, which hampers movement and sensation in feet and hands.3
Yet other proteins attach to DNA and switch genes on or off. These regulatory proteins allow the information encoded in a gene to become transformed into an amino acid string. Hundreds of such regulators work simultaneously, each of them flipping the switch on some genes but not on others. (They are the source of yet another kind of innovation we will explore in chapter 5.)
And there’s more: rigid protein rods that form a cell’s molecular skeleton, proteins that import nutrients, proteins that dump waste outside the cell, proteins that relay molecular messages between cells, and on and on.
Each of these proteins has its own special talent, expressed in its phenotype, whose most important aspect is shape.4 I do not just mean the molecular shape of the twenty kinds of amino acids in proteins, and the order in which they are strung together—the primary structure of a protein.5 I mean the shape this string forms in space through the protein folding process that I first mentioned in chapter 1.
Hydrophilic amino acids love to be near the water that surrounds them, whereas hydrophobic amino acids avoid water—like the oily parts of membrane molecules—and these molecular sympathies help an amino acid string fold in a stereotypical manner. Driven by heat’s vibrations, a folding protein explores many shapes of its amino acid chain until it finds one where most water-avoiding amino acids cluster together and form a densely packed core that is surrounded by the water-loving molecules on the protein surface.6 What’s more, some amino acids attract and others repel each other, and these chemical sympathies also influence a protein’s fold. The protein folding process—driven by nothing but erratically bouncing molecules—is yet another reminder of the power of self-organization. It occurs millions of times a day in each of our trillion cells, whenever they manufacture a new protein string.
Viewed atom by atom, the three-dimensional fold of a protein like the sugar-splitting sucrase from chapter 2 would appear like a shapeless blob. But by stepping back and focusing on the string that holds the amino acid beads together (figure 10), one can discern regular and repeated patterns of amino acid arrangements in space that occur in many proteins. They include helixlike corkscrews (one is labeled in the figure) or several parallel strands, an arrangement also called a sheet.7 Helices and sheets are major elements of protein folds and form the secondary structure of a protein. And these helices and sheets, together with the strands that connect them, form the labyrinthine three-dimensional or tertiary structure of a protein’s fold shown in figure 10.
FIGURE 10. Sucrase folded in space
Even though it may look like a tangled pile of spaghetti, the fold in figure 10 is actually highly organized: Any two sucrase amino acid strings spontaneously fold into the exact same shape.8 This shape is critical to a protein’s function, because its helices and sheets guide and constrain the endless heat-induced vibrations and oscillations of the folded protein. Such constrained movement allows enzymes like sucrase to cleave sucrose—a bit like the blades of scissors whose movement is constrained by the pivot that connects them and enables them to slice paper.9 Because heat-caused vibrations are so important to enzymes, these molecules also have an optimal temperature: Too little heat and their vibrations are not powerful enough to reorganize molecules. Too much heat and the vibrations shake the fold apart—into the linear string of amino acids. Worse than that, unfolded proteins often aggregate into large inert clumps like the proteins in boiled eggs. Such clumps of unfolded proteins are more than useless: When too many of them accumulate, for example in your brain, they bring forth horrible diseases like Alzheimer’s.10
In the bewildering realm of oscillating shapes that sucrase and other proteins inhabit, each shape has a specific job. Each is well suited for what it does and highly complex. In the words that Darwin used to describe the living world, it is a world of “endless forms most beautiful,”11 but these forms—unknown to Darwin—keep the living world alive.
Arrival of the Fittest: Solving Evolution's Greatest Puzzle Page 11