
The Mysterious World of the Human Genome


by Frank Ryan


  Together we watch how the tightly wound spiral of DNA thread that is wrapped around the tightly packed histones loosens up again.

  “So you know what has just happened?”

  “The gene—or whatever sequence is coded by this stretch of the DNA—is closed down when the thread is packed up tight.”

  “And ready for transcription when it uncoils. Exactly!”

  “So the histone code switches a gene on or off, just like the methylation status?”

  “It may look very simple, but there is nothing remotely accidental in the acetyl, phosphate, or methyl groups cozying up to the tails. It is under very careful control by other elements within the epigenetic control system, a control that would make the secret police force of a dictatorship look like amateurs. And just like the methylation status, the histone code is also amenable to change within the lifetime of the individual. It is responsive to stimuli entering the genome through environmental influences. And, like methylation, it has the potential to change heredity and thus bring about evolutionary change without changing the DNA of the genetic code.”

  “Just how powerful,” you ask, “is this histone code?”

  “Let me give you a single example. That protein cloud we saw is actually an enzyme called a ‘deacetylase.’ What it did was remove the acetyl chemical group from the tails of a single histone pack. Its name, in the jargon, is deacetylase HDAC11, and we watched it switch off a gene that codes for a protein involved in the body's immune system. That protein decides whether you or I will respond to a certain antigen as self or foreign. In medical terms, this single ‘epigenetic mark’ will influence in an important way our future immune tolerance—in other words, how we might respond to a dangerous invading microbe or how, if we suffered an organ failure, we might react to an organ transplant.”

  I can explain through another example. Identical twins—in the jargon, “monozygotic twins”—are conceived as clones of one another. Thus they are conceived, and live out their lives, with identical genomes. The sum of all of the epigenetic systems in the body is known as the “epigenome.” Identical twins are conceived from the same pluripotent cells, so they begin as embryos with identical epigenomes. We formerly believed that this implied that identical twins were also born with identical epigenomes, but now we know that this is not true. The epigenome of every fetus, including those of identical twins, has already begun to change by the time of birth in response to environmental influences arriving into the physiology of the fetus during development in the womb. Of course it doesn't stop there. A study in Spain showed that depending on the circumstances in which each of two identical twins grows up and lives out their lives, they continue to accumulate these epigenetic differences.

  In practice, the epigenetic silencing of genes through methylation will often be reinforced by a silencing histone code being applied to the same gene's promoter—a belt-and-braces guarantee that the gene remains silent.

  Perhaps we should take a breather. I want you suitably rested before we confront the newly discovered and even more extraordinary epigenetic story of what some geneticists formerly, and somewhat disparagingly, called DNA's Cinderella sister: that second nucleotide molecule called ribonucleic acid. In short, RNA.

  Of course, we have already come across RNA. Because it was outshone by its stellar sister molecule, DNA, we rather assumed that RNA had had its day in some ghastly early model of the earth, a time before the greening of the planet, when life was still bogged down in a chemical stage of its evolution, with self-replicators competing with one another for the grubby chemicals they needed in the dirt of the primeval planet. It was understandable, in retrospect, that amid the fabulous catalog of findings that followed the discovery of DNA and how it coded for proteins, scientists thought of genes, and the human genome, more or less exclusively in terms of DNA as the master molecule.

  Today, however, we recognize that this was a blinkered vision—and it was this same blinkered vision that led to half the human genome remaining blank in that pie chart dating back to 2001. The solution to that enigma lay in the more recent discoveries of extraordinary new roles for RNA in the burgeoning new discipline of epigenetics. This is changing much of what we formerly assumed about genetics, biology, molecular biology, and medicine. These new discoveries are so bracing and challenging that we are obliged to rethink our views on how the genome works. This dilemma is throwing up some very fundamental questions. What, for example, do we really mean by a gene? If we adhere to the concept of the gene as the unit of heredity then we shall have some redefining to do. For example, Thomas Gingeras, one of the investigating experts involved in the ENCODE Project, goes so far as to argue that the fundamental unit of the genome—the basic unit of heredity—should no longer be the gene at all but the RNA transcript decoded from DNA.

  We had many discussions on DNA, for I had come to Oxford with two half ideas, both of which were more than half wrong.

  SYDNEY BRENNER

  This new chapter of discovery began as long ago as 1991 with a discovery by American biologists Victor Ambros, Rosalind Lee, and Rhonda Feinbaum when they were studying a single gene, called lin-14, which regulates development in the worm C. elegans. We might recall that this tiny worm was the experimental subject chosen by Crick's friend and colleague Sydney Brenner for his pioneering experiments into the genes and molecular biology of development. We might also recall that this tiny worm proved so helpful and amenable to Brenner and his colleagues that it subsequently became the test organism for thousands of laboratory experiments all over the world. Brenner's own studies were extended by the biologists Robert Horvitz in the United States and John Sulston in England, where their efforts were crowned by the Nobel Prize in Physiology or Medicine in 2002. In the press release from the Nobel Institute, the award was made for their discoveries in the “genetic regulation of organ development and programmed cell death.”

  The italics are mine, because I want to draw attention to what those words might imply.

  In an adult human being, more than a thousand billion cells are created every day through cell division, or “mitosis.” In every such cell division the entire genome is copied. At the same time, an equal number of cells die through a form of controlled suicide. This is what is referred to as “programmed cell death,” or “apoptosis.” It is amazing, when we think about it, that death as well as life is programmed into our genome; and Brenner's work led to our first understanding of the genetics involved in bringing about death. Specific regulatory genes and genetic pathways are involved in this darker side of programming. And yes, RNA—that strange, almost quixotic, sister molecule—the Cinderella of the nucleotide sisters—is involved in this curious regulation.

  A variety of tiny RNA molecules, between 20 and 30 nucleotides long, was discovered back in 1991, but scientists were unsure what they represented. A few years later, the UK-based botanist David Baulcombe, together with his colleague Andrew Hamilton, found that some small interfering RNA molecules, or siRNAs, were somehow capable of silencing messenger RNA molecules. In the United States, two geneticists, Craig C. Mello and Andrew Z. Fire, adopted the C. elegans model to study this in finer detail. Focusing on the genetic control of a muscle protein that was important in the worm's normal sinewy movement, they injected the gonads with siRNA molecules and watched how these affected the worm's movement. To start with, they broke down the double-stranded siRNA into its two strands, known as “sense” and “anti-sense,” first testing the sense strand (the RNA that exactly matched the original genetic DNA coding) and then the anti-sense strand. Neither had any effect on the worm's movements. Only when they injected both the sense and anti-sense RNA at the same time did something happen. The worm developed an abnormal twitch, the same dysfunctional movement that they saw when the relevant gene was damaged by a mutation.

  This led to the startling realization that these small RNA molecules were capable of taking out specific messenger RNAs. In other words, even after transcription had already taken place (with the messenger RNA copied from the gene, the introns removed, and the exons spliced together for the final messenger RNA molecule to move out into the cytoplasm ready to code for the protein), these tiny RNA molecules would terminate the process.

  In passing we realize why the Swedish scientists thought it so important to look for viral proteins not merely as messenger RNA transcripts but as the actual expressed proteins within the cells.

  This epigenetic mechanism came to be called “RNA interference,” or “RNAi.” It was yet another mechanism of epigenetic control. The implications were startling. These RNAis recognized key sequences in the specific messenger RNA molecule, so they could home in to inactivate or even to destroy it entirely. In 2006, Fire and Mello were awarded the Nobel Prize in Physiology or Medicine for their discovery.

  From the earliest days of genetics, scientists had focused on what was perceived to be dogma—that genes invariably coded for proteins. Then scientists discovered that this required RNA as the messenger molecule, mRNA. It also required a different type of RNA for the transport of amino acids to the ribosomes, the so-called transfer RNA, or tRNA, as well as a third variety of RNA that was part of the basic ribosome structure, the so-called ribosomal RNA which somehow read off mRNA while translating its coding to the protein. However, in those early years these roles were perceived as secondary, or at best intermediary, to the noble gene-to-protein central axis—the central paradigm. But now we had a fourth type of RNA: these terminator RNAi molecules! Of course, the three different varieties of RNA, other than messenger RNA, must be coded within the nuclear-based chromosomes by sequences of matching DNA. But these coding zones, the DNA sequences that coded for these non-messenger forms of RNA, could hardly be called genes. They did not code for proteins; on the contrary, the RNA molecules they coded for were end points in themselves.

  At this point, geneticists were faced with a dilemma. How were they to classify the genetic sequences that coded for these RNA upstarts? And now that RNAi had been discovered, a number of other small “non-coding” RNAs were further challenging the paradigm.

  Some geneticists toyed with the idea of an “RNA gene,” a gene that coded for an end-point RNA. But others had reservations about the very idea of RNA genes. Whatever the terminological inconsistencies, there could be no doubting the fact that the human genome coded for a surprising variety of RNA molecules that did not code for proteins but nevertheless had important roles to play in the control and expression of genes.

  RNA inhibition by small, non-coding, double-stranded RNA molecules has proved not only important in theory but also of practical usefulness to biologists and geneticists, allowing them to examine the role of specific genes by observing what happens to cells or life-forms when the gene is “knocked out.” The potential for medical therapy is equally important. Some women bear the frightening burden of inheriting one or other of the breast- and ovarian-cancer-associated gene mutations, BRCA1 and BRCA2, while other patients present with the early symptoms of Huntington's disease. All it would take to alleviate the suffering and distress of such patients would be to switch off the relevant mutated genes. Some time in the future, perhaps sooner than we might imagine, molecular geneticists will find a way to do this. Moreover, RNAis are not the sole contribution of RNA to the regulation of genes. A different group of small non-coding RNAs, known as PIWI-interacting RNAs, or piRNAs, appear to be playing an important role in the epigenetic silencing of dangerous viral sequences in the human genome. Moreover, there is another, perhaps even more astonishing class of non-coding RNAs that regulates the human genome, a relatively new discovery that explains that mysterious black hole in the 2001 draft genome—the 50 percent of our human DNA that was left a baffling blank.

  We mammals have evolved sexually differentiating chromosomes, the X and the Y, so that females inherit two copies of the X, one from each parent, and males inherit an X from the mother and a Y from the father. In addition, we inherit 22 non-sex-differentiating chromosomes, called “autosomes,” from each parent, making up a total nuclear genome of 46 chromosomes. While the Y chromosome contains an estimated 78 protein-coding genes, largely concerned with testicular development as well as the male physique, fertility, and sperm production, the X chromosome contains roughly 2,000 genes, few of which have anything to do with sexuality. This chromosomal discordance between the sexes led to a potential imbalance in regulation during embryological development. If the sex-linked chromosomes were to be fully expressed during embryological development, female embryos (and females throughout life) would be subject to double the dose of the X-linked genes, while male embryos (and males throughout life) would be subject to a single dose of those same X-linked genes. This could lead to unwelcome regulatory clashes.

  In 1961, Mary F. Lyon, a former pupil of the epigenetic pioneer Conrad H. Waddington, realized that a solution to this developmental riddle might be to switch off one of the two X chromosomes in females. Lyon was duly vindicated when geneticists subsequently confirmed “X-inactivation” in female embryos on about the sixteenth day of embryological development. Curiously, the inactivation does not select for the X chromosome from any particular parent; it appears to choose randomly between the maternally or paternally inherited X chromosomes, and it does not switch off all of the inactivated chromosome, but something more like 60 percent of its genes. The remaining 40 percent are important in protecting females from diseases caused by recessive mutations on the X chromosome. This is why females are rarely affected by color blindness or hemophilia—they would need a double dose of the mutated recessive genes—but males need only a single copy on their solitary X chromosome.

  In 1991, some thirty years after Lyon came up with the idea, scientists working at Stanford University discovered that a single gene on the inactivated X chromosome played a key part in the process of X-inactivation. They called the gene Xist after the role it encoded, as the “X inactive specific transcript.” They also assumed it must work through translating to a corresponding Xist protein. But when they looked for the protein they couldn't find it. This was baffling, since they could trace the gene's expression to the relevant messenger RNA, which was spliced to remove the introns, with the exons cobbled together as usual. But the mRNA failed to move out to the ribosomes, where one would expect the protein to be manufactured.

  I think it might be timely for us to make another exploration on the train, to observe one of the most mind-blowing recent discoveries about our human genome. As we enter the magical landscape, I direct you to one of two similar tracks, running parallel to one another: the two X chromosomes. We have entered the genome of a female fetus on the critical sixteenth day of embryological development.

  We watch cell division taking place within the early embryo, with replication of the genome, the double tracks of the two X chromosomes unzipping along the weak hydrogen bonds of the interlocking jigsaws of the sleepers, to liberate the sense from the anti-sense strands of DNA. The speed of copying is impressive. A blizzard is approaching, though the composite flakes are not snow but the RNA-bound nucleotides G, A, C, and U. As we continue to watch, sections of the sense strand begin to glow in different colors. It is all part of the magic that enables us to make out sequences that mark out genes or promoters or viral sections or sections we currently know diddly-squat about. The process is very similar to what we saw with the coding for a protein, with the stretch of DNA being copied to its matching stretch of messenger RNA, but here the copying appears to go on and on, extending far more extensively than the thousand or so nucleotides we would expect for a single gene. A huge molecule of RNA is being fashioned, comprising some 17,000 nucleotides. It appears to be peculiar also in its intrinsic structure, with the equivalent of genetic full stops, or “stop codons,” at intervals throughout its length. We have never seen anything remotely like this structure before.

  “What is it?”

  “It's a long non-coding RNA, the product of what some geneticists call an RNA gene. The scientific name for it is Xist RNA.”

  We watch as the RNA molecule flows like a leaking pipe over the targeted X chromosome, changing the gene-activating histone epigenetic marks in a way that hauls together the histone packs into a tight non-coding formation and calling in methylation protein clouds to switch off the cytosine-guanine couplets.

  “What's it doing?”

  “It's switching everything off but not the whole chromosome, just the unwanted 60 percent.”

  Xist was duly recognized as the first of a remarkable new class of epigenetic controllers—what we now call the “long non-coding RNAs,” or lncRNAs. But soon afterward, a second, very powerful, lncRNA was discovered, and it explained another epigenetic mystery.

  Geneticists had already observed that the genome can recognize the specific parental origins of the matching pairs of chromosomes. For example, it could select the specific paternal, or the maternal, chromosome when allowing certain genes, or even whole clusters of genes, to be expressed. This epigenetic mechanism, which is known as “imprinting,” is a key factor in the genetic causation of diseases such as Prader–Willi and Angelman syndromes because it selects a damaged chromosome according to a specific parent of origin even though the chromosome inherited from the other parent of origin is perfectly normal. Geneticists discovered that a key mechanism of imprinting was caused by epigenetic silencing of a whole region of the non-chosen chromosome by another long non-coding RNA, known as Air.

  Inspired by these discoveries, scientists began to search for more of these long non-coding RNA molecules and discovered that they are pervasively transcribed throughout mammalian genomes. In time, lncRNAs were acknowledged as part of a newly recognized and very powerful epigenetic regulatory system, giving rise to an explosion of new research. This exciting new venture is still taking place as I write, but already we know that our human genome, like that of all plants and animals, contains vast numbers of long and small non-coding RNAs, within which the lncRNAs comprise a class of their own, ranging in size from 200 to more than 100,000 nucleotides. And what this research is revealing, in terms of the coding for these lncRNAs, is at first glance bizarre, yet also wonderfully logical.

 
