39. To be precise, the average amount of non–protein coding DNA found between two human genes exceeds one hundred thousand base pairs (bps) and is thus almost one thousand times greater than that found between two E. coli genes. The calculation is based on 2.9×10⁹ base pairs as the size of the human genome, and on twenty-four thousand genes with an average protein-coding length of 1330 base pairs. See Table 3.2 of Lynch (2007). Some human genes are millions of base pairs apart.
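The arithmetic behind note 39 can be sketched in a few lines of Python. The note gives only the three inputs, so the formula used here (genome size minus total coding length, divided by the number of genes) is an assumption about how the figure was obtained, and the function and constant names are purely illustrative.

```python
# Inputs quoted in note 39 (attributed there to Table 3.2 of Lynch 2007).
GENOME_SIZE_BP = 2.9e9          # size of the human genome in base pairs
GENE_COUNT = 24_000             # number of protein-coding genes
MEAN_CODING_LENGTH_BP = 1_330   # average protein-coding length per gene


def mean_noncoding_per_gene(genome_size_bp, gene_count, mean_coding_length_bp):
    """Average non-protein-coding DNA per gene (assumed formula, see lead-in)."""
    total_coding_bp = gene_count * mean_coding_length_bp
    return (genome_size_bp - total_coding_bp) / gene_count


human_spacing_bp = mean_noncoding_per_gene(
    GENOME_SIZE_BP, GENE_COUNT, MEAN_CODING_LENGTH_BP
)
print(f"Average non-coding DNA per human gene: {human_spacing_bp:,.0f} bp")
```

Under these assumptions the result is roughly 1.2×10⁵ base pairs, consistent with the note's statement that the average exceeds one hundred thousand base pairs.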
40. A substantial fraction of vertebrate non-coding DNA is transcribed into RNA that is not translated into protein, some of which may also help regulate genes.
41. See Lynch and Conery (2000).
42. Other costs include that of transcribing the new gene into RNA, as well as the—comparatively small—cost of manufacturing the DNA building blocks of the additional DNA. The cost is not the same for all genes and depends on the amount of RNA and protein manufactured. See Wagner (2005, 2007).
43. Another difference is that in higher organisms, the energy costs of gene expression may be less important in determining reproductive success than many other factors, such as mobility, cognitive abilities, and attractiveness to mates.
44. See pages 60–61 of Lynch (2007). Mutations that inactivate gene duplicates are not the only—and not even the most important—source of pseudogenes. Another is a mechanism called retroposition that creates gene duplications and is very different from the DNA recombination and repair I discuss in the main text. Retroposition is a process in which the RNA transcribed from a gene is transcribed back into DNA by an enzyme called reverse transcriptase. Often, the resulting DNA is not a complete copy of the gene, or it integrates into a location of a genome that does not contain the necessary regulatory DNA words needed to transcribe the gene. Such genes are effectively dead on arrival and form a large reservoir of so-called retropseudogenes in our genome.
45. See Dawkins (1976).
46. There are multiple different kinds of mobile DNA, also referred to as transposable elements. They include transposons, long terminal repeat elements, as well as long and short interspersed nuclear elements. See pages 56–60 of Lynch (2007).
47. This selection can take different forms and has led to the “domestication” of some elements, favoring organisms whose mobile DNA causes mutations that are relatively harmless, for example, by inserting into gene-poor regions where its insertions are not likely to be damaging.
48. See Chapter 7, page 168, of Lynch (2007).
49. See pages 174–179 of Lynch (2007).
50. See page 178 of Lynch (2007) and Figure 5 of Lynch (2006).
51. See page 57 of Lynch (2007).
52. See pages 56–60 of Lynch (2007). Short interspersed nuclear elements are particularly abundant in our genome, which contains more than 1.5 million of them. They can only transpose passively, using the transposition enzymes from other mobile DNA to do so. They are the ultimate DNA parasites.
53. See Lynch (2007).
54. See Gilbert (1978).
55. See Lynch and Conery (2003) and pages 256–261 in Lynch (2007). My discussion of introns is based on so-called spliceosomal introns, which are characteristic of eukaryotes and do not exist in prokaryotes—another aspect of their simpler genome organization. And when I refer to microbes here I mean eukaryotic microbes, for example, unicellular fungi like baker’s yeast.
56. See Table 3.2 of Lynch (2007).
57. See page 51 and Table 3.2 of Lynch (2007). The simpler genome organization of bacteria also has advantages. It allows shorter generation times and facilitates the ability to transfer genes horizontally.
58. See Tables 3.1 and 3.2 of Lynch (2007).
59. See Chimpanzee Sequencing and Analysis Consortium (2005).
60. None of this implies that natural selection is unimportant. Selection is essential to reliably ascending fitness peaks in adaptive landscapes.
Chapter 4: Teleportation in Genetic Landscapes
1. See Hardison (1999), as well as Aronson et al. (1994). They are different solutions in the sense that their amino acid strings are very different, but they bind oxygen by the same mechanisms. Essentially, they are different texts expressing the same “meaning.”
2. See Wagner (2014).
3. See Hayden et al. (2011).
4. See Figure 6 of Bershtein et al. (2008). See also Wu et al. (2016) for a different kind of experimental demonstration showing that adding dimensions to an evolutionary journey may facilitate adaptation.
5. Similar networks of high-elevation ridges also occur in mathematical models of speciation—that is, of the evolution of reproductive isolation—where mathematical biologist Sergei Gavrilets calls them “holey” adaptive landscapes. See Gavrilets (1997).
6. The phrase is so well-known that it has its own Wikipedia page (https://en.wikipedia.org/wiki/Beam_me_up,_Scotty), even though Kirk, played by William Shatner, never used this exact phrase.
7. The assumption is that the two chromosomes in a pair come from unrelated individuals. If the mother and father of the individual whose homologous chromosomes are compared are related (i.e., they come from an inbred family), the number of differences could be much smaller. Also, the incidence of pairwise differences varies widely across different regions of the human genome. See Jorde and Wooding (2004).
8. To be precise, three billion is the approximate number of nucleotides in one set of twenty-three chromosomes, and to get the number of nucleotides in all chromosome pairs, this number would need to be doubled.
9. I am neglecting here that a small number of approximately 30–40 (germ-line) mutations occur during the life cycle of each human, as this number of differences pales in comparison to the amount of genetic change introduced by recombination between a typical mother and father. See Campbell and Eichler (2013).
10. The calculation is based on 1.5×10⁶ steps and 2.5 feet per step, which amounts to 710 miles. If the distance from one “end” of the landscape to the other were given by the maximally possible number of nucleotides that could differ between two genomes (which is approximately equal to 3×10⁹ for a genome of the same size as the human genome), then that distance would translate to 3×10⁹ steps, which equals 7.5×10⁹ feet or 1.42×10⁶ miles (2.28×10⁶ kilometers). Comparing that distance of more than two million kilometers to the distance between the earth and the moon (at a mere 384,400 kilometers) is another way to appreciate how vast this landscape is.
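The conversions in note 10 can be checked with a short Python sketch. The step counts, the stride of 2.5 feet, and the earth-moon distance come from the note. The conversion factors (5,280 feet per mile, 0.3048 meters per foot) are standard, and the function name is illustrative only.

```python
FEET_PER_STEP = 2.5          # stride length used in note 10
FEET_PER_MILE = 5_280        # standard conversion
METERS_PER_FOOT = 0.3048     # standard conversion (exact by definition)
EARTH_MOON_KM = 384_400      # distance quoted in note 10


def steps_to_distance(steps):
    """Convert a number of steps to (feet, miles, kilometers)."""
    feet = steps * FEET_PER_STEP
    miles = feet / FEET_PER_MILE
    kilometers = feet * METERS_PER_FOOT / 1_000
    return feet, miles, kilometers


# Typical journey: 1.5 million steps.
_, typical_miles, _ = steps_to_distance(1.5e6)

# Largest possible journey: one step per nucleotide, about 3 billion steps.
max_feet, max_miles, max_km = steps_to_distance(3e9)
moon_ratio = max_km / EARTH_MOON_KM

print(f"1.5e6 steps = {typical_miles:.0f} miles")
print(f"3e9 steps   = {max_feet:.2e} feet = {max_miles:.2e} miles = {max_km:.2e} km")
print(f"That is about {moon_ratio:.1f} times the earth-moon distance")
```

The sketch reproduces the note's 710 miles and 7.5×10⁹ feet, and it puts the full span of the landscape at a little under six earth-moon distances.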
11. Such instant speciation through hybridization can sometimes take place when plants double their chromosome numbers, which can occur spontaneously as a result of DNA-replication or cell-division errors. See Futuyma (2009). Many hybrids may be less well adapted to any habitat than their parents are, but a few can discover entirely new “lifestyles.” See Arnold et al. (1999) and Arnold and Hodges (1995).
12. See Pennisi (2016).
13. See pages 492–493 of Futuyma (2009), as well as Rieseberg et al. (2007).
14. See Lamichhaney et al. (2018) and Lamichhaney et al. (2015), as well as Grant and Grant (2009) and Pennisi (2016). The descendants of the large bird are also an example of successful inbreeding because only two siblings survived the drought, mated, and founded a family lineage whose members largely reproduce with each other. Darwin’s finches are only a few among a growing number of species where it has become clear that hybridization is rampant. Its abundance softens “hard” boundaries between species because even species that were previously thought to be reproductively isolated (the very definition of a biological species) often turn out to exchange material in a process called introgressive hybridization, which need not lead to speciation. It takes place when hybrid organisms reproduce over multiple generations with members of either parental population. Allele combinations that help the hybrid survive in a new habitat can then become preserved in its genome. This phenomenon has been documented in diverse organisms, including some fungi that are involved in cheese making, malaria mosquitoes, and American gray wolves. See Arnold and Kunte (2017), as well as Pennisi (2016), Ropars et al. (2015), Norris et al. (2015), and Anderson et al. (2009).
15. See Bushman (2002). The process usually terminates before all genes in a genome have been transferred. It tends to transfer genes jointly that are in close proximity on the bacterial DNA, a fact that has been used to map genes in the E. coli genome. Because the transferred genes often include those encoding the ability to make a sex pilus and donate DNA, the genome transfer machinery may be spreading “selfishly” among bacteria. But in doing so, it also transfers other genes from the donor that may prove useful to the recipient. I also note that not all horizontal gene transfer involves bacterial sex (conjugation). Other recombination mechanisms include the uptake of naked DNA (transformation) and the shuttling of DNA from one cell to another through infectious viruses (transduction). Horizontal gene transfer also occurs between bacteria and plants, fungi, or animals, as well as among the latter three classes of organisms, although the mechanisms are not always understood. See Arnold and Kunte (2017).
16. Recombination can occur between bacteria that differ in 10 percent or more of their DNA text, compared to the average of 0.1 percent in humans. See Fraser et al. (2007). For comparison, genomic diversity in sunflowers is greater than that in humans but also remains below 1 percent of DNA text divergence. See Pegadaraju et al. (2013). Bacteria have different generation times, lower mutation rates, and genomes more dense with genes. See Lynch (2007). For these reasons, the same degree of genomic difference between two bacteria and two higher organisms does not necessarily translate into the same amount of time since their most recent common ancestor lived.
17. See Gelvin (2003), as well as Robinson et al. (2013).
18. Some simple animals are in fact capable of photosynthesis, but they engage in symbioses with other organisms that provide this ability for them, and they do not use the synthesized carbohydrates as their only source of nutrition. An intriguing question is why more of them do not engage in these symbioses. See Smith (1991).
19. See Copley et al. (2012), Russell et al. (2011), and Maeda et al. (2003), as well as Hiraishi (2008).
20. I note that many antibiotic-resistance traits are encoded by only a single gene, which makes their spreading especially easy.
21. Some viruses, like the human immunodeficiency virus (HIV), undergo recombination while they evolve inside patients, but it would be hard to match the density and diversity of recombining molecules in a test tube.
22. More precisely, they (and DNA shuffling) use a heat-stable polymerase that is needed for the polymerase chain reaction. This reaction is an important tool of molecular biology to make many copies of a DNA sequence of interest. See Stemmer (1994).
23. See Crameri et al. (1998). They improved an enzyme that can cleave moxalactam, another cephalosporin antibiotic like cefotaxime.
24. See Ness et al. (1999) and Raillard et al. (2001), as well as Crameri et al. (1997).
25. More precisely, the organisms I am referring to do not reproduce sexually.
26. See Judson and Normark (1996).
27. See Flot et al. (2013).
28. However, in molecules like proteins one can create many recombinants experimentally and evaluate what percentage of them are functional proteins. See Drummond et al. (2005).
29. See Drummond et al. (2005) and Martin and Wagner (2009), as well as Hosseini et al. (2016). While this work simulates the effects of recombination on DNA, a limited amount of experimental work on proteins in Drummond et al. (2005) reaches the same conclusion.
Chapter 5: Of Diamonds and Snowflakes
1. See pages 81–114 of Gerst (2013).
2. See Kroto (1988) and Kroto et al. (1987).
3. See Smalley (1992).
4. Their original discovery is described in Kroto et al. (1985). For early reviews see Smalley (1992) and Kroto (1988). Bucky-balls are one of multiple allotropes—different physical forms—of carbon that also include graphite and diamond.
5. See Cami et al. (2010), as well as Berne and Tielens (2012) and Garcia-Hernandez et al. (2010).
6. See Campbell et al. (2015).
7. To be precise, van der Waals first postulated the existence of such a force.
8. More specifically, four atoms form a tetrahedron, and five atoms form a triangular bipyramid, which can be obtained by joining two tetrahedra. The numbers I cite in this paragraph come from experimental observations and theoretical calculations by Meng et al. (2010). I note that the number of valleys and how it scales with the number of atoms depends on the kinds of atoms and forces considered. See Wales (2003) and Berry (1993). Also, the numbers reported here come from free-energy landscape calculations, which take not only potential energy but also entropy into account. Entropy refers to the number of configurations that a given number of atoms can assume, and the laws of thermodynamics tell us that atoms, left to their own devices, will tend to maximize the number of configurations they can assume. In other words, while minimizing their potential energy, atoms will also tend to maximize their entropy. In combination, the two principles lead to even more complex landscapes. On a different note, it is worth pointing out that the most stable arrangements are not highly regular in all materials. They can even be irregular, for example in gold. See Michaelian et al. (1999). High regularity and symmetry mean that multiple minima correspond to the same stable atomic configuration but with permuted identities (“labels”) of atoms. All these factors—strength of force, entropy, symmetry—influence landscape structure, but they do not affect the central principle that the complexity of potential and free-energy landscapes increases exponentially with the number of atoms.
9. For example, in a cluster of merely thirty-two potassium chloride molecules, there are at least ten billion more minima of high potential energy corresponding to amorphous structures than there are low-lying minima corresponding to stable structures with an atomic configuration similar to the cubic arrangement familiar from rock salt. See page 2389 of Berry (1993).
10. See Oliver-Meseguer et al. (2012), Corma et al. (2013), as well as Michaelian et al. (1999). I note that the distinction between a molecule and a cluster is not clear-cut. See, for example, Chapter 1.2 of Wales (2003). The sizes of the gold clusters I mention are in the hundreds of picometers (10⁻¹² meters) rather than nanometers (10⁻⁹ meters), such that one could assign them to the realm of “picotechnology.” See Cartwright (2012).
11. See Wales (2003).
12. Here it is worth pointing out some simplifications in the main text. For example, the importance of cooling in crystallization comes not only from the efficient exploration of an energy landscape, but also from the fact that the solubility of many solutes decreases with decreasing temperature, which makes more and more solute molecules available for crystal growth. This is also why (slow) evaporation of the solvent, which reduces its volume and thus increases the solute’s concentration, is a commonly used strategy for crystallization. If cooling or evaporation occurs too fast, many solute molecules will precipitate in amorphous clumps. Also, when a crystal forms, not all its constituent atoms or molecules simultaneously explore the crystal’s energy landscape. Instead, a crystal begins to form through nucleation, a process in which some of a solute’s molecules associate in the correct (minimum energy) configuration either by themselves or as prompted through an impurity, such as a dust particle in the solution. The crystal then grows from such a nucleus as more and more molecules aggregate. This means that the energy landscape is not explored haphazardly but rather in preferred directions that are given by how growth occurs. This is a specific example of a more general phenomenon in self-assembling molecular structures: kinetics can facilitate or hinder their formation. But even when our “marble” explores an energy landscape along some preferred direction, it may encounter shallow valleys separated by peaks—suboptimal and imperfect molecular arrangements—and so heat-induced jitters are still important.
13. Those where at least some atoms are covalently bound are known as covalent crystals. A prominent example is diamond, where each carbon atom is bound to four adjacent ones in a highly regular arrangement that yields the octahedral symmetry of diamond.
14. See Smalley (1992) for a discussion of the temperatures at which bucky-balls form and for the importance of (relatively) slow cooling for a high yield of bucky-balls.
15. See page 501 of Wales (2003) for the short assembly time of bucky-balls. This short assembly time also testifies to the importance of assembly kinetics, that is, the exploration of an energy landscape along preferred directions. The evidence comes from the observation that bucky-balls do not assemble from scratch in one step but instead build themselves piece by piece from smaller yet already highly regular molecules. See Kroto (1988) and Smalley (1992).
16. An additional complication is that some materials are polymorphic; that is, they can form alternative but similarly stable crystal structures.
17. This is only one of the many complexities involved in growing snowflakes. See Libbrecht (2005). It is another example where self-assembly kinetics is important.
18. Some defects of bucky-balls are visualized in Chapter 8.6 of Wales (2003).
19. This statement about the majority of carbon atoms refers to the fraction of carbon atoms that get bound in large clusters, more than 50 percent of which can be bucky-balls under the right conditions. See Kroto et al. (1985). Even overall yields exceeding 20 percent, as have been reported, are remarkable. See page 108 of Smalley (1992).
Chapter 6: Creative Machines
1. See Biery (2014).
2. For the vehicle routing problem with n customers, the depot is not included in the city count, so the basic count of tours is n! = 1×2×…×n. One cannot generally assume that a route and the route obtained from it by reversing the customer order are equally long. For example, the routes from the depot to the first and last customer may have different lengths, or one-way streets may exist. This means that the number n! of routes cannot be reduced further by taking such “symmetries” into account. For the closely related traveling salesman problem, the number of routes is (n-1)!, because one of the n cities (e.g., the starting point) is arbitrarily chosen (it corresponds to the “depot”), and the remaining (n-1) cities can be permuted at will. For both problems, the number of solutions grows faster than exponentially.
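The route counts in this note can be illustrated with a short Python sketch. The formulas n! and (n-1)! come from the note. The function names are illustrative, and the comparison with 2ⁿ is just one convenient exponential yardstick chosen here to show what "faster than exponentially" looks like in practice.

```python
from itertools import permutations
from math import factorial


def vrp_route_count(n_customers):
    """Number of customer orderings for one vehicle leaving a fixed depot: n!."""
    return factorial(n_customers)


def tsp_route_count(n_cities):
    """Number of tours when one of the n cities is fixed as the start: (n-1)!."""
    if n_cities < 1:
        raise ValueError("need at least one city")
    return factorial(n_cities - 1)


# Brute-force check for a small case: list every ordering of 4 customers.
orders_for_four = list(permutations(range(4)))

# Compare factorial growth with the exponential 2**n (an illustrative yardstick).
for n in (5, 10, 20):
    print(f"n={n:2d}  n!={vrp_route_count(n):>22,}  "
          f"(n-1)!={tsp_route_count(n):>21,}  2^n={2**n:>9,}")
```

Already at twenty customers the number of orderings exceeds 2×10¹⁸, while 2²⁰ is only about one million, which is why exhaustive enumeration is hopeless for realistic routing problems.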