What Is Life?
Information processing permeates all aspects of life. To illustrate this, let’s look at two examples of complex cellular components and processes that are best understood through the lens of information.
The first is DNA and the way its molecular structure explains heredity. The critical fact about DNA is that each gene is a linear sequence of information written in the four-letter language of DNA. Linear sequences are a familiar and highly effective strategy for storing and conveying information; it’s the one used by the words and sentences that you are reading here, and also the one used by the programmers who wrote the code for the computer on your desk and the phone in your pocket.
These different codes all store information digitally. Digital here means that information is stored in different combinations of a small number of digits. The English language uses 26 basic digits, the letters of the alphabet; computers and smartphones use patterns of ‘1’s and ‘0’s; and DNA’s digits are the four nucleotide bases. One great advantage of digital codes is that they are readily translated from one coding system into another. This is what cells do when they translate the DNA code into RNA and then into protein. In doing so, they transform genetic information into physical action, in a seamless and flexible way that no human-engineered system can yet match. And whilst computer systems must ‘write’ information onto a different physical medium in order to store it, the DNA molecule ‘is’ the information, which makes it a compact way to store data. Technologists have recognized this and are developing ways to encode information in DNA molecules to archive it in the most stable and space-efficient way possible.
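To make the idea of translating between digital codes concrete, here is a minimal sketch in Python. It is purely illustrative and not from the book: it uses the simplified convention that the messenger RNA mirrors the DNA coding strand with T replaced by U, and it includes only a handful of entries from the standard genetic code.

```python
# A toy illustration of translating between digital codes: DNA -> RNA -> protein.
# Simplified: the DNA string is treated as the coding strand, so transcription is
# just T -> U; only a few entries of the standard genetic code are included.

CODON_TABLE = {
    "AUG": "Met",   # start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GAA": "Glu",
    "UAA": "Stop",
}

def transcribe(dna: str) -> str:
    """Rewrite the four-letter DNA code in the four-letter RNA code."""
    return dna.replace("T", "U")

def translate(rna: str) -> list[str]:
    """Read the RNA three letters at a time and translate it into amino acids."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE.get(rna[i:i + 3], "?")
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein

dna = "ATGGGCGAATTTTAA"
rna = transcribe(dna)      # "AUGGGCGAAUUUUAA"
print(translate(rna))      # ['Met', 'Gly', 'Glu', 'Phe']
```

The point of the sketch is simply that the same information passes cleanly from one small alphabet to another, which is what makes digital codes so easy to translate.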
DNA’s other critical function, its ability to copy itself very precisely, is also a direct consequence of its molecular structure. Considered in terms of information, the molecular attractions between the pairs of bases (A to T, and G to C) provide a way to make very precise and reliable copies of the information held by the DNA molecule. This intrinsic replicability ultimately explains why information held in DNA is so stable. Some gene sequences have persisted through unbroken series of cell divisions over immense durations of time. Large parts of the genetic code needed to build the various cellular components, such as the ribosomes, for example, are recognizably the same in all organisms, be they bacteria, archaea, fungi, plants or animals. That means the core information in those genes has been preserved for probably three billion years.
This explains why the double helix structure is so important. By revealing it, Crick and Watson created a bridge linking together the geneticists’ ‘top down’ conceptual understanding of how the information needed for life is passed down through the generations, with the ‘bottom up’ mechanistic understanding of how the cell is built and operated at the molecular scale. It emphasizes why the chemistry of life only makes sense when it is considered in terms of information.
The second example where information is key to understanding life is gene regulation, the set of chemical reactions cells use to turn genes ‘on’ and ‘off’. What this provides is a way for cells to use only the specific portions of the total set of genetic information that they actually need at any given moment in time. The critical importance of being able to do this is illustrated by the development of a formless embryo into a fully formed human being. The cells in your kidney, skin and brain all contain the same total set of 22,000 genes, but gene regulation means the genes needed to make a kidney were turned ‘on’ in embryonic kidney cells, and those that function specifically to create skin or brain were turned ‘off’, and vice versa. Ultimately, the cells in each of your organs are different because they use very different combinations of genes. In fact, only about 4,000, or a fifth, of your total set of genes are thought to be turned on and used by all the different types of cells in your body to support the basic operations needed for their survival. The rest are only used sporadically, either because they perform specific functions only required by some types of cell, or because they are only needed at specific times.
Gene regulation also means that exactly the same set of genes can be used to create dramatically different creatures at different stages of their lives. Every elaborate and complex brimstone butterfly starts out as a rather less impressive green caterpillar; the dramatic metamorphosis from one form to the other is achieved by drawing on different portions of the same total set of information stored in the same genome and using it in different ways. But gene regulation is not only important when organisms are growing and developing, it is also one of the main ways all cells adjust their workings and structures to survive and adapt when their environments change. For example, if a bacterium encounters a new source of sugar, it will quickly turn on the genes it needs to digest that sugar. In other words, the bacterium contains a self-regulating system that automatically selects the precise genetic information it needs to improve its chances of surviving and reproducing.
Biochemists have identified many of the basic mechanisms used to achieve these various feats of gene regulation. There are proteins that function as so-called ‘repressors’ that turn genes off, or ‘activators’ that turn genes on. They do this by seeking out and binding to specific DNA sequences in the vicinity of the gene being regulated, which then makes it either more or less likely that a messenger RNA is produced and sent to a ribosome to make a protein.
It is important to know how all this works at the chemical level, but as well as asking how genes are regulated, we will also want to understand which genes are regulated, whether they’re being turned on or off, and why. Answering these questions can lead to new levels of understanding. The answers can start to tell us about how the information held in the genome of a rather uniform human egg cell is used to instruct the formation of all the hundreds of different types of cell present in an entire baby; how a new heart drug can turn genes on and off to correct the behaviour of cardiac muscle cells; how we might re-engineer the genes of bacteria to make a new antibiotic; and much more besides. When we start to look at gene regulation in this way it is clear that concepts based on information processing are essential to understanding how life works.
This powerful way of thinking emerged from studies made by Jacques Monod and his colleague François Jacob, work that earned them a Nobel Prize in 1965. They knew that the E. coli bacteria they studied could live on one or the other of two sugars. Each sugar needed enzymes made by different genes to break it down. The question was, how did the bacteria decide when to switch between the two sugars?
These two scientists devised a brilliant series of genetic experiments that revealed the logic underlying this particular example of gene regulation. They showed that when bacteria are feeding on one sugar, a gene repressor protein switches off the key gene needed for feeding on the alternative sugar. But when the alternative sugar is available, the bacteria rapidly switch back on the repressed gene for digesting that sugar. The key to that switch is the alternative sugar itself: it binds to the repressor protein, stopping it from working properly, and thereby allowing the repressed gene to be turned back on. This is an economical and precise way of achieving purposeful behaviour. Evolution has devised a way for the bacterium to sense the presence of an alternative energy source, and to use that information to adjust its internal chemistry appropriately.
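The switch can be written down as a few lines of logic, which is very much how Jacob and Monod thought about it. The following Python sketch is illustrative only: the function name and its single input are invented, and the repressor is treated as an abstract informational component rather than as an actual molecule.

```python
# A minimal sketch of the regulatory logic Jacob and Monod uncovered, treated
# purely as information processing. The names are illustrative, not the real
# molecular species.

def digestion_gene_on(alternative_sugar_present: bool) -> bool:
    """Is the gene for digesting the alternative sugar switched on?"""
    # The repressor is active only when the alternative sugar is absent;
    # when present, the sugar binds the repressor and stops it working.
    repressor_active = not alternative_sugar_present
    # The gene is expressed whenever the repressor is not blocking it.
    return not repressor_active

print(digestion_gene_on(False))  # False: no alternative sugar, the gene stays off
print(digestion_gene_on(True))   # True: the sugar inactivates the repressor, the gene turns on
```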
Most impressively, Jacob and Monod managed to work all of this out at a time when nobody could directly purify the specific genes and proteins involved in this process. They solved the problem by looking at their bacteria through the prism of information, which meant they did not need to know about all the specific ‘nuts and bolts’ of the chemicals and components underpinning the cellular process they were studying. Instead, they used an approach based on genetics, mutating genes involved in the process and treating genes as abstract informational components that controlled gene expression.
Jacob wrote a book called The Logic of Life and Monod wrote one called Chance and Necessity. Both covered similar issues to those I am discussing in this book and both greatly influenced me. I never knew Monod, but met Jacob a number of times. The last time I saw him, he invited me to lunch in Paris. He wanted to talk about his life and discuss ideas: how to define life, the philosophical implications of evolution and the contrasting contributions made by French and Anglo-Saxon scientists to the history of biology. Constantly fidgeting due to old war wounds, he was the archetypal French intellectual, incredibly well-read, philosophical, literary and political – a great and memorable meeting for me.
Jacob and Monod were working at a time when understanding was emerging of how information flowed from gene sequence to protein to cellular function, and how that flow was managed. This information-centred approach also guided my thinking. When I started my research career I wanted to know how the cell interpreted its own state and organized its internal chemistry to control the cell cycle. I did not want just to describe what happens during the cell cycle, I wanted to understand what controlled the cell cycle. That meant I often came back to thinking about the cell cycle in terms of information and considering the cell not only as a chemical machine but also as a logical and computational machine, as Jacob and Monod considered it – one that owes its existence and future to its ability to process and manage information.
In recent decades biologists have developed powerful tools and invested a lot of effort in identifying and counting the diverse components of living cells. For example, my lab put a lot of work into sequencing the whole genome of the fission yeast. We did this with Bart Barrell, who had worked with Fred Sanger, the person who invented the first practical and reliable way to sequence DNA back in the 1970s. I met Fred several times during this project, although he had officially retired by then. He was a rather quiet, gentle man who liked growing roses and, like many of the most successful scientists I have met over the years, was always generous with his time, talking to and encouraging younger scientists. When he came to Bart’s lab, he looked like a gardener who had lost his way, a gardener who had, of course, won two Nobel Prizes!
Together, Bart and I organized a collaborative effort of about a dozen labs from around Europe to read all of the approximately 14 million DNA letters in the fission yeast genome. It took about 100 people and around three years to complete, and was, if I remember correctly, the third eukaryote to be completely and accurately sequenced. That was around 2000. Now the same genome could be sequenced by a couple of people in about a day! Such have been the advances in DNA sequencing over the last two decades.
Gathering data like this is important, but only as a first step towards the crucial, and more challenging, aim of understanding how it all works together. With this objective in mind, I think most progress will be made by looking at the cell as being made up of a series of individual modules that work together to achieve life’s more complex properties. I use the word module here to describe a set of components that function as a unit in order to execute a particular information-processing function.
By this definition, Watt’s governor would be a ‘module’, one with the clearly defined purpose of controlling the speed of an engine. The gene regulatory system Jacob and Monod discovered for controlling sugar usage in bacteria is another example. In terms of information, they both work in a similar way: they are examples of information-processing modules called negative feedback loops. Modules of this kind can be used to maintain a steady state, and they are employed widely in biology. They work to keep your blood sugar levels relatively constant, even after you consume a sweet snack like a sugar-coated doughnut, for example. Cells in your pancreas can detect an excess of sugar in your blood and respond by releasing the hormone insulin into your bloodstream. Insulin, in turn, triggers cells in your liver, muscles and fatty tissues to absorb sugar out of your blood, reducing your blood sugar, and converting it into either insoluble glycogen or fat, which is then stored for later use.
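The essence of a negative feedback loop can be captured in a few lines. Here is a toy Python sketch, not taken from the biology itself: the set point, the gain and the units are all invented, and the only point being made is that a response proportional to the excess drives the level back towards its target.

```python
# A toy negative feedback loop: when 'blood sugar' rises above a set point,
# a corrective response proportional to the excess pulls it back down.
# All numbers are invented purely to show the steady-state behaviour.

SET_POINT = 5.0   # target level (arbitrary units)
GAIN = 0.5        # how strongly the feedback responds to an excess

def step(blood_sugar: float) -> float:
    excess = blood_sugar - SET_POINT
    correction = GAIN * max(excess, 0.0)   # respond only when the level is too high
    return blood_sugar - correction        # the response drives the level back down

level = 9.0  # just after the doughnut
for _ in range(8):
    level = step(level)
    print(round(level, 2))
# The printed values fall from 9.0 back towards the set point of 5.0.
```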
A different type of module is the positive feedback loop, which can form irreversible switches that once turned on are never turned off. A positive feedback loop works in this way to control the way apples ripen. Ripening apple cells produce a gas called ethylene, which acts to both accelerate ripening and to increase the production of ethylene. As a result, apples can never get less ripe, and neighbouring apples can help each other to ripen more quickly.
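A correspondingly small sketch shows why positive feedback behaves so differently. The numbers below are invented and the model is deliberately crude; it is loosely patterned on the ripening example rather than being a real description of it.

```python
# A toy positive feedback loop, loosely modelled on apple ripening: ethylene
# accelerates ripening, and riper fruit makes more ethylene, so once the loop
# starts it runs to completion and never reverses. Numbers are illustrative only.

ripeness = 0.05   # a small initial nudge (0 = unripe, 1 = fully ripe)
ethylene = 0.05

for day in range(1, 11):
    ethylene += 0.5 * ripeness                        # riper fruit produces more ethylene
    ripeness = min(1.0, ripeness + 0.3 * ethylene)    # ethylene speeds up ripening
    print(day, round(ripeness, 2))

# Ripeness climbs ever faster and then saturates at 1.0; nothing in the loop
# can push it back down, which is what makes the switch irreversible.
```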
When different modules are joined together, they can produce more sophisticated outcomes. For example, there are mechanisms that produce switches that can flip reversibly between ‘on’ and ‘off’ states, or oscillators that rhythmically and continuously pulse ‘on’ and ‘off’. Biologists have identified oscillators that work at the level of gene activities and protein levels – these are used for many different purposes, for example to differentiate between day and night. Plants have cells in their leaves that use an oscillating network of genes and proteins to measure the passing of time, and thereby allow the plant to anticipate the start of a new day, turning on the genes needed for photosynthesis just before it gets light. Other oscillators pulse on and off as a result of communication between cells. One example is the heart that is beating in your chest right now. Another is the oscillating circuit of neurons that ticks away in your spinal cord, which activates the specific pattern of repeated contractions and relaxations of leg muscles that allows you to walk at a constant pace. All without you having to give it any conscious thought.
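One way to see how joining modules produces an oscillator is to couple an on/off output to a slow negative feedback. The Python sketch below is a crude relaxation oscillator of my own construction, not any particular biological circuit, and its thresholds and rates are invented.

```python
# A crude relaxation oscillator built from two linked modules: an output that is
# 'on' or 'off', and an inhibitor that slowly builds up while the output is on
# and decays while it is off. The delay in the negative feedback is what makes
# the system pulse rhythmically rather than settle down.

inhibitor = 0.0
output_on = True
trace = []

for _ in range(40):
    if output_on:
        inhibitor += 0.25          # inhibitor accumulates while the output is on
        if inhibitor >= 1.0:
            output_on = False      # enough inhibitor switches the output off
    else:
        inhibitor -= 0.25          # inhibitor decays while the output is off
        if inhibitor <= 0.0:
            output_on = True       # once it has gone, the output switches back on
    trace.append("#" if output_on else ".")

print("".join(trace))  # prints a regular rhythm of '#' and '.' bursts
```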
Different modules link together in living organisms to generate more complex behaviours. A metaphor for this is the way the different functions of a smartphone work. Each of those functions – the phone’s ability to make calls, access the internet, take photographs, play music, send emails and so on – can be considered like the modules operating in cells. An engineer designing a smartphone has to make sure all these different modules work together so the phone can do everything it needs to do. To achieve this, they create logical maps that show how information flows between the different modules. The great power of starting the design of a new phone at the level of modules is that engineers can make sure their plans make functional sense, without getting lost in the details of individual parts. That way, they need not initially give too much thought to the huge number of individual transistors, capacitors, resistors and countless other electronic components that make up each module.
Adopting the same approach provides a powerful way to comprehend cells. If we can understand the cell’s different modules and see how cells link them together to manage information, we don’t necessarily need to know all the minute molecular details of how each module works. The overriding ambition should be to capture meaning, rather than simply catalogue complexity. I could, for example, give you a list containing all the different words printed in this book, together with how frequently they occur. This catalogue would be like having a parts list without an instruction manual. It would give a sense of the complexity of the text, but almost all of its meaning would be lost. To grasp that meaning, you need to read the words in the correct order and develop an understanding of how they convey information at higher levels, in the form of sentences, paragraphs and chapters. These work together to tell stories, give accounts, connect ideas and make explanations. Exactly the same is true when a biologist catalogues all the genes, proteins or lipids in a cell. It is an important starting point, but what we really want is an understanding of how those parts work together to form the modules that keep the cell alive and able to reproduce.
Analogies derived from electronics and computing, like the smartphone example I used just now, are helpful in understanding cells and organisms, but we must use them with care. The information-processing modules used by living things and those used in human-made electronic circuitries are in some respects very different. Digital computer hardware is generally static and inflexible, which is why we call it ‘hardware’. By contrast, the ‘wiring’ of cells and organisms is fluid and dynamic because it is based on biochemicals that can diffuse through water in the cells, moving between different cellular compartments and also between cells. Components can be reconnected, repositioned and repurposed much more freely in a cell, effectively ‘rewiring’ the whole system. Soon, our helpful hardware and software metaphors begin to break down, which is why the systems biologist Dennis Bray coined the insightful term ‘wetware’ to describe the more flexible computational material of life. Cells create connections between their different components through the medium of wet chemistry.
This is also true in the brain, the archetypal and highly complex biological computer. Throughout your life, nerve cells are growing, retracting and making and breaking connections with other nerve cells.
For any complex system to behave as a purposeful whole, there needs to be effective communication both between the different components of the system and with the outside environment. In biology, we call the set of modules that carry out this communication signalling pathways. Hormones released into your blood, like the insulin that regulates your blood sugar, are one example of a signalling pathway, but there are many others too. Signalling pathways transmit information within cells, between cells, between organs, between whole organisms, between populations of organisms and even between different species across whole ecosystems.
The way signalling pathways transmit information can be adjusted to achieve many different outcomes. They can send signals that simply turn an output on or off, like a light switch, but signals can work in more subtle ways too. In some situations, for example, a weak signal switches on one output and a stronger signal switches on a second output. In a similar way a whisper gets your immediate neighbour’s attention, but a shout is needed to evacuate a whole room in an emergency. Cells can also exploit the dynamic behaviour of signalling pathways to transmit a far richer stream of information. Even if the signal itself can only be ‘on’ or ‘off’, more information can be transmitted by varying the time spent in each of those two states. A good analogy is Morse code. Through simple variations in the duration and order of signal pulses, the ‘dots’ and ‘dashes’ of Morse code can convey streams of information that overflow with meaning, be it an SOS call or the text of Darwin’s On the Origin of Species. Biological signalling pathways that behave in this way can generate information-rich properties that carry more meaning than signalling sequences conveying a simple ‘yes/no’ or ‘on/off’ message.
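The Morse-code idea can itself be written out in a few lines. In the Python sketch below the encoding is genuine Morse for the two letters used, but the function and its representation of pulses as durations are simply an illustration of how an on/off channel can carry richer information.

```python
# A small sketch of the Morse-code idea: the channel can only be 'on' or 'off',
# yet varying how long it stays on lets it carry much richer information than a
# single yes/no. The encoding below is genuine Morse for the letters used.

MORSE = {"S": "...", "O": "---"}   # dot = short pulse, dash = long pulse

def to_pulses(message: str) -> list[int]:
    """Turn letters into a train of 'on' durations (1 = short, 3 = long)."""
    pulses = []
    for letter in message:
        for symbol in MORSE[letter]:
            pulses.append(1 if symbol == "." else 3)
    return pulses

print(to_pulses("SOS"))   # [1, 1, 1, 3, 3, 3, 1, 1, 1]
```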