Life's Greatest Secret

by Matthew Cobb


  Back at King’s, Bruce Fraser, like Watson and Crick and Pauling before him, was struggling with a triple helix structure, but this time with the bases on the inside. The model led nowhere. Meanwhile, Rosalind Franklin was finishing up her work on DNA before leaving the lab, groping her way towards a solution, oblivious to what was happening in Cambridge. The progress she made on her own, increasingly isolated and without the benefit of anyone to exchange ideas with, was simply remarkable. In January 1953 she struggled with the data from the Patterson function and complained in her notebook that she could not ‘reconcile nucleotide sequence with Chargaff’s analysis’: her data suggested a cumbersome seven-nucleotide unit that gave approximate 1:1 ratios of purines and pyrimidines, but nothing as precise as some of Chargaff’s data.49 But by 24 February, she had realised that both the A and the B forms of DNA were double helices. She also suggested that the bases on either strand were interchangeable (A with T, C with G), and above all she realised that ‘an infinite variety of nucleotide sequences would be possible to explain the biological specificity of DNA’.50 Franklin was almost there, but she did not have a chance to get any further, because Watson and Crick had already crossed the finishing line.

  By the end of February 1953, Watson and Crick had agreed the basic outline of the double helix model, but this was merely a seductive concept. It needed to be turned into precise numbers, spatial relationships and chemical bonds, in the shape of a physical model. It took a week of calculation and intense work before the double helix, with complementary base pairing between A and T and between C and G, finally emerged from a tangle of precise metal templates held together by clamps. It was a molecular model informed by experimental data; it did not simply emerge from the diffraction data, as Franklin had wanted. On 7 March, Wilkins wrote to Crick announcing that the ‘dark lady’ (Franklin) was leaving King’s the following week, that ‘much of the 3 dimensional data is in our hands’ and promising to launch a ‘general offensive on Nature’s secret strongholds on all fronts’, signing off with ‘It won’t be long now’. When Crick opened the letter on 12 March, the double helix model stood in front of him.51

  3. Double helix model of DNA from Watson and Crick (1953a)

  Wilkins and Franklin came to Cambridge to see the model, and immediately agreed it must be right. Although Watson described Wilkins as being remarkably magnanimous at being scooped, Wilkins recalled that he felt ‘rather stunned’ and bitter and that he made ‘an angry outburst’.52 Whatever the case, it was agreed that the model would be published solely as the work of Watson and Crick, while the supporting data, without which the model would not have existed, would be published by Wilkins and Franklin – separately, of course. The first public announcement of the discovery was made by Sir Lawrence Bragg in April, at a Solvay conference in Belgium. Pauling, who had seen the model in Cambridge and given it his blessing, told the audience that the Watson and Crick model was ‘very likely’ to be ‘essentially correct’, and that his triple helix was mistaken.53 On 25 April there was a party at King’s when the three articles were published in Nature. Franklin did not attend. She was now at Birkbeck and had stopped working on DNA.

  *

  Franklin was never told the full extent to which Watson and Crick had relied on her data to make their model; if she suspected, she did not express any bitterness or frustration. In subsequent years she became very friendly with Crick and his wife, but she was never close to Wilkins or Watson, although she interacted with Watson as she worked on the structure of the tobacco mosaic virus.54 Franklin died of ovarian cancer in 1958, four years before the Nobel Prize was awarded to Watson, Crick and Wilkins for their work on DNA structure.

  In 1968 Jim Watson published The Double Helix, in which he gave a gripping but partial account of events and a frank description of his own bad behaviour, particularly with regard to Franklin. The epilogue to the book contains a generous and fair description of Franklin’s vital contribution and a recognition of his own failures. The Double Helix also suggested that Max Perutz had given Watson and Crick a confidential document when he handed over the MRC report containing those vital paragraphs by Franklin. Perutz was hurt by this allegation and showed that it was not true. There can be no doubt that the data in the report provided Crick with the insight he needed to come up with the correct structure, but the document was not confidential, and above all Franklin had publicly communicated the essential results nearly eighteen months earlier. Watson had been in the audience, but he had not understood the significance of what was being said.55

  One of the main facts that in retrospect seems so obvious, but which was not at the time, is the role of what are sometimes called the ‘Chargaff rules’ – the fact that the amounts of A and T and of C and G are equivalent. These ratios were not known to be so precise at the time and they were certainly not ‘rules’. As Jerry Donohue, who shared Watson and Crick’s room at the Cavendish laboratory, later recalled in somewhat exaggerated fashion:

  When the final model of DNA was discovered – more or less by accident – it wasn’t Chargaff’s rules that made the model, but the model that made the rules.56

  The Watson and Crick paper in Nature concluded coyly, ‘It has not escaped our notice that the pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.’57 This was the solution that Watson had been searching for – the complementary pairing of the bases gave a potential insight into gene duplication; with a single molecule it was possible to create two identical daughter molecules, simply by copying each strand using complementary pairing. Even more significant was what the three Nature papers did not say – none made any reference to how genes functioned, or the significance of the sequence of bases. There was still no genetic code.
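  The copying mechanism hinted at in that closing sentence can be illustrated with a short sketch. It is a deliberate simplification, not anything from the 1953 papers: strands are plain strings read position by position, the opposite chemical orientation of the two chains is ignored, and the names PAIRING, complement and replicate are invented for the illustration.

```python
# Illustrative sketch only: strands are plain strings, aligned position by
# position; the antiparallel orientation of real DNA strands is ignored.

# Complementary pairing: A with T, C with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the partner strand dictated by complementary base pairing."""
    return "".join(PAIRING[base] for base in strand)


def replicate(duplex: tuple[str, str]) -> tuple[tuple[str, str], tuple[str, str]]:
    """Separate the two strands and rebuild a partner for each one.

    Each daughter duplex keeps one old strand and gains one newly
    built complementary strand.
    """
    first, second = duplex
    daughter_one = (first, complement(first))
    daughter_two = (complement(second), second)
    return daughter_one, daughter_two


if __name__ == "__main__":
    strand = "ATGCCGTA"  # arbitrary example sequence
    parent = (strand, complement(strand))
    one, two = replicate(parent)
    print(parent)
    print(one == parent and two == parent)
```

  Because each strand fully determines its partner, both daughter duplexes come out identical to the parent, which is the point Watson and Crick left implicit.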

  * The rap contest can be seen at http://www.youtube.com/watch?v=35FwmiPE9tI.

  –SEVEN–

  GENETIC INFORMATION

  On 19 March 1953, about two weeks after the double helix model had been completed, Francis Crick wrote a letter to his 12-year-old son, Michael, who was at boarding school. Crick told Michael what he had discovered, and included a sketch of the structure of DNA. He then went on to explain the significance of the double helix:

  It is like a code. If you are given one set of letters you can write down the others. Now we believe that the D.N.A. is a code. That is, the order of the bases (the letters) makes one gene different from another gene (just as one page of print is different from another).1*

  Although the idea that the sequence of bases might be the source of genetic specificity had been in the air for some time, this was the first time that anyone had said that DNA contains a code, and Crick’s son Michael was the first to read it. In 2013 the letter was sold at auction for $6m.

  Crick went even further in the second paper he published in Nature with Watson, which appeared on 30 May 1953. Like their first publication, this article contained no data at all – it was purely theoretical. As the title explained, its aim was to explore the ‘genetical implications of the structure of deoxyribonucleic acid’.2 The article began by suggesting that DNA was ‘the carrier of a part of (if not all) the genetic specificity of the chromosomes and thus of the gene itself’. This ‘big picture’ was absent from all three DNA papers published in April, which dealt solely with the structural chemistry of the molecule, not its function, with the exception of the coy closing phrase ‘it has not escaped our notice …’. Much of Watson and Crick’s second article was devoted to expanding on that cheeky insight. They described how, during gene duplication, the double helix could unwind, with each chain forming the template for the construction of a new molecule, leading to the creation of two identical daughter double helices.3

  In the middle of this discussion there was a half-sentence, almost a throwaway remark, that echoed the terms used in Crick’s letter to his son, but which expanded them to a far broader conception and propelled biology into the modern age:

  The phosphate-sugar backbone of our model is completely regular, but any sequence of pairs of bases can fit into the structure. It follows that in a long molecule many different permutations are possible, and it therefore seems likely that the precise sequence of the bases is the code which carries the genetical information.

  With the exception of Ephrussi and Watson’s unfunny satirical letter to Nature seven weeks earlier, this was the first time that the content of a gene had been described as information. It is not known where the phrase ‘genetical information’ came from – Watson recalls that Crick wrote most of the manuscript in less than a week at the end of April, and the form and the content are more typical of Crick’s style than of Watson’s.4 Given the growing presence of ‘information’ in scientific articles from a wide range of disciplines, and the popular interest in communication theory and cybernetics, it seems likely that the idea just came naturally to Crick, as part of the zeitgeist. There is no indication that either Crick or Watson had read Shannon or Wiener, or that they were using the term in explicit reference to their mathematical ideas. As with the word ‘code’, ‘information’ seems to have been used as an intensely powerful metaphor rather than a precise theoretical construct.

  In the past, scientists had spoken of genetic specificity, but with the introduction of the idea that the DNA sequence contained ‘the code which carries the genetical information’ a whole new conceptual vocabulary became available. Genes were no longer mysterious embodiments of specificity, they were information – a code – that could be transmitted (another word from the electronic age), and the central hypothesis was that the code was composed of a series of letters – A, T, C and G. How exactly that code might function, what it might represent, was not stated. Nevertheless, the words used so lightly by Crick and Watson changed the way in which scientists could speak and think about genes. Eventually this new vocabulary contributed to the development of novel parallels between genes and electronic communication and processing.

  None of this was obvious at the time. Watson was not particularly keen on the article being published at all, as he explained to Delbrück in May:

  Crick was very much in favour of sending in the second Nature note. To preserve peace I have agreed to it and so it shall come out shortly.5

  Wilkins was not impressed, either. He recalled: ‘Some of Francis and Jim’s friends, however, thought the second paper was rather “going over the top”.’ Among those friends was Wilkins himself.6 Despite these doubts, the advantage of Crick’s approach was that it explicitly set out the two revolutionary implications of the double helix: complementary base pairing explained gene duplication, while the sequence of bases explained genetic specificity. Here were two hypotheses that would revolutionise biology, if they were true.

  Crick recognised the links between the discovery of the double helix and Schrödinger’s ideas from a decade earlier. On 12 August 1953, he sent copies of the two Nature articles to Schrödinger, accompanied by a brief letter:

  Watson and I were once discussing how we came to enter the field of molecular biology, and we discovered that we had both been influenced by your little book, ‘What is Life?’

  We thought you might be interested in the enclosed reprints – you will see that it looks as though your term ‘aperiodic crystal’ is going to be a very apt one.

  *

  In July 1953, Watson and Crick received a strange letter from the US. Handwritten in big letters on headed notepaper from the University of Michigan students’ union, the letter was full of crossings-out and spelling mistakes and looked as though it was from a crank. In fact it was from the Russian-born cosmologist George Gamow (pronounced ‘Gam-off’), who was a long-time friend of Max Delbrück and had chaired the 1946 meeting on ‘The Physics of Living Matter’.7 Although he was an expert in nuclear physics, Gamow had not passed security clearance for the Manhattan Project and had not been involved in the development of the atomic bomb at all. FBI surveillance of Gamow continued after the war, and he was interviewed by them as late as 1957, although they never found any evidence against him.8

  Gamow was an eccentric 50-year-old who liked his whisky and had a sideline in popular science books based around an everyman character called Mr Tompkins. In his odd letter, Gamow seized upon Watson and Crick’s suggestion that the sequence of bases contained a ‘code’, and audaciously tried to come up with ways of cracking that code. Gamow’s starting point was that each organism could be characterised by ‘a long number’, which corresponded to the number of positions in the DNA sequence. He then dismissed decades of research in classical genetics showing that genes are located in definite positions on chromosomes, and argued that it seemed ‘more logical’ if genes were instead ‘determined by the different mathematical characteristics of the entire number’. In an attempt to make his idea clearer, Gamow wrote, with his typically erratic spelling:

  the animal will be a cat if Adenine is always followed by cytosine in the DNA chain, and the characteristics of a hering is that Guanines allways appear in pairs along the chain … This would open a very exciting possibility of theoretical research based on mathematics of combinatorix and the theory of numbers!9

  Gamow said he would be in England in the autumn, and asked whether they could meet. Watson and Crick were both about to leave Cambridge and pursue their separate careers – Watson was going to Pasadena, while Crick was headed for Brooklyn Polytechnic, once he had finished his PhD on the structure of haemoglobin. So the pair simply ignored Gamow’s letter.10 Or, rather, they did not reply to it. Crick did not ignore it: Gamow had planted an idea that would not go away.

  Gamow did not give up easily. Over the next couple of months he worked up his ideas about the genetic code and in October he sent a brief note to Nature, which was published in the following February. He tried to publish a longer article on the same subject in Proceedings of the National Academy of Sciences, co-authored by his fictitious character Mr Tompkins. The Editor of PNAS spotted the jape and was not amused, so Gamow sent the article to the Royal Danish Academy, with Tompkins’s name excised.11 Gamow addressed the link between the DNA code and proteins by pointing out that the central question was how four-digit ‘numbers’ in the gene (A, C, T and G) were translated into an amino acid ‘alphabet’ in a protein.

  Gamow’s answer was ingenious and was not dissimilar to the template idea that Caldwell and Hinshelwood had published three years earlier. Gamow assumed that proteins were synthesised directly on the DNA molecule, so that the shape formed by the bases as the DNA molecule twisted round acted as a kind of template upon which the amino acids were arranged. Because of the spiral shape of DNA, there would be a diamond-shaped ‘hole’ between different rows of bases; the four bases on each side of this diamond therefore constituted the code.

  4. Gamow’s ‘diamond’ model of the genetic code, from Gamow (1954). The round structures numbered 1–4 are the bases, the diamond shapes labelled a–t are the 20 naturally occurring amino acids.

  Gamow noted that there were twenty different possible kinds of ‘hole’ and continued, ‘it is inviting to associate these “holes” with twenty different amino acids essential for living organisms.’ Gamow even came up with a prediction that would test his model: because each base contributed to the shape of the ‘hole’ of more than one amino acid, ‘there must exist a partial correlation between the neighbouring amino acids in protein molecules, since the neighbouring holes have two common nucleotides.’12 By treating the code as a mathematical problem rather than a biological one, Gamow was opening the door to years of speculation about the nature of the genetic code. He was also committing the classic physicist’s error of assuming that living systems are designed according to elegant, logical principles that can be revealed by mathematics. In fact they are historical, carrying the baggage of their evolutionary past, and have not been designed at all. They are often far from logical, nor are they generally elegant. They work, and that is enough.

  Gamow sent Crick a copy of his paper and eventually the two men met in Brooklyn in December 1953. Crick’s office-mate Vittorio Luzzati recalled:

  It was amazing. These two spirited men debated, argued and fought their way through the subject of the code disposing of issues, one after another, in their exuberance their voices rising to shouts.13

  Crick was not convinced by Gamow’s ideas – for a start, he did not think that protein synthesis took place on the DNA molecule, as Gamow assumed. It was by now well known that DNA was found in the nucleus, as part of the chromosomes, whereas RNA was found freely in the cell, where protein synthesis took place. Crick and Watson, following Caspersson, Brachet, Boivin, Vendrely and Dounce, considered that RNA acted as an intermediary between gene and protein. This was the meaning of the equation DNA → RNA → protein. The very starting point of the diamond code was wrong.

  But Gamow had put his finger on a fundamental and seductive issue: the potential relation between the number of naturally occurring amino acids (twenty) and the number of possible combinations in the code formed by the bases A, C, T and G. As Gamow immediately realised, if the code were composed of two-letter ‘words’ (AA, AT, AC, AG, etc.), there would be sixteen possible combinations – not enough for each ‘word’ to correspond to a different amino acid. But if the code were composed of three letters (AAA, AAT, AAC, etc.), there would be sixty-four possible combinations – more than enough.
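  The arithmetic behind this argument is easy to check: with four bases, words of length n can be formed in 4 to the power n ways. The sketch below simply enumerates them; the names BASES, AMINO_ACIDS and code_words are made up for the illustration and are not drawn from Gamow's paper.

```python
from itertools import product

BASES = "ATCG"      # the four letters of the DNA alphabet
AMINO_ACIDS = 20    # number of naturally occurring amino acids


def code_words(length: int) -> list[str]:
    """List every possible 'word' of the given length over the four bases."""
    return ["".join(letters) for letters in product(BASES, repeat=length)]


if __name__ == "__main__":
    for length in (1, 2, 3):
        words = code_words(length)
        verdict = "enough" if len(words) >= AMINO_ACIDS else "not enough"
        print(f"{length}-letter words: {len(words)} combinations ({verdict})")
```

  One-letter and two-letter words give 4 and 16 combinations, both short of twenty, while three-letter words give 64, leaving a surplus that any three-letter scheme would have to account for.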

 
