by Bryan Sykes
This was once thought to be a completely random process, with DNA exchanges able to happen anywhere along the length of the chromosomes, but it turns out that this is not so. It now seems that there are “hot spots” along each chromosome where these exchanges are much more frequent. Rather than being completely random, as in a properly shuffled deck of cards, it is as though there are runs of cards that stay together.
The blocks of DNA between hot spots that are not disrupted by shuffling can be tens of thousands of bases long and contain several SNPs. This means that they tend to retain the combination of variants at each of the SNP sites within them. So a block with five SNPs might have the combination, as seen by the chip, of red/green/green/red/red, each one indicating the presence of a particular variant at the SNP site. This introduces a new level of discrimination, as there are now 2⁵, or 32, possible combinations for this segment. This is the same principle that provides for the enormous range of genetic signatures generated by only a few markers on the Y chromosome when they are used in combination. Although the situation on the autosomes is far less helpful than on the Y chromosome, not least because exchanges are not exclusively confined to hot spots, the presence of relatively undisturbed segments of DNA is nonetheless valuable for the next stage in the chromosome painting process.
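The count of thirty-two can be checked by simply listing the possibilities. The sketch below is only an illustration: it assumes each SNP site shows one of two variants, labeled "red" and "green" after the chip readout described above, and the function name is made up for the purpose.

```python
from itertools import product


def block_combinations(n_snps, variants=("red", "green")):
    """List every possible combination of variants for a block of n_snps SNP sites."""
    return list(product(variants, repeat=n_snps))


combos = block_combinations(5)
print(len(combos))  # 2**5 = 32
print(("red", "green", "green", "red", "red") in combos)  # True
```

Each extra SNP in a block doubles the number of possible combinations, which is why even short blocks are so much more discriminating than single sites.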
After the Human Genome Project finished in 2003, there were a lot of geneticists looking for something to do, and a lot of idle machinery. Some of them plowed on with sequencing other genomes, first mouse, then chicken, and so on. They are still going and, predictably, the species being sequenced are becoming more exotic. In 2011 the complete DNA sequences of the nine-banded armadillo and the canary were on their way to completion, in company with multitudes of potentially useful bacteria and fungi.3
Other geneticists switched their researches to studying the DNA variation among individual human genomes and soon began to realize that the human genome was falling into blocks. Thanks to the discovery of DNA-exchange hot spots and the cooler regions in between, a huge international scientific effort to describe these blocks as fully as possible began to take shape in 2002. How many there were, where the boundaries were, and so on. The impetus and the large sums made available were driven by the optimism of finding the elusive common disease genes, the “Napoleons of Crime.” By knowing where these blocks were, it was going to be easier to locate these genes by the simple strategy of association between the blocks and the presence or absence of the disease in question in large numbers of patients and controls. Where the association with a particular block was high, then Macavity must be hiding nearby. Surely?
To discover how these blocks were behaving, the HapMap Project (after “haploblocks,” as these chunks are known) looked in detail at the genomes of individuals from three different parts of the world.4 The chosen ones, 270 in all, came from Africa, Asia, and Europe, and each individual’s DNA was typed for about 3 million SNPs. The ninety-strong African contingent was from Ibadan in Nigeria, members of the Yoruba tribe; the ninety Asian volunteers were from Tokyo and Beijing; while the ninety Europeans were actually Americans with their roots in northern and western Europe. The work was divided up among labs in the United States and Canada, England, China, and Japan with each lab concentrating on different chromosomes, as they had in the initial sequencing of the human genome. Many of the HapMap scientists were veterans of the Human Genome Project and knew their way around their favorite chromosomes. From these results half a million SNPs that were favorably placed within each block were selected, and these were put on a DNA chip. As with the Human Genome Project, one attractive feature of HapMap was the release of data into the public domain, and it was through this release that software engineers were able to get their chromosome brushes out and start painting.
The aim of chromosome painting is to assign each block in an individual’s genome to one of the three continental origins represented by the HapMap volunteers. This is of course a gross simplification, but it seems to work. Let us take President Obama as an example, not that I have his details. (I was told that the president has declined to be tested while he is still in office.) I may not have his chromosome painting, but I have a pretty good idea what it would look like. As everyone knows, President Obama has an African father and a European American mother. He has inherited one chromosome of each of the twenty-two autosome pairs from his father, with DNA blocks that will likely match up with the Nigerian volunteers more than they do with the Asians or the Europeans who helped build up the three continental reference collections. Equally, his other chromosome in each pair has come from his mother, whose DNA blocks will probably all match the European more closely than either the Asian or African chromosomes. The painting software makes these comparisons for blocks of DNA of about ten thousand bases all along the twenty-two pairs of chromosomes. With a total of 3 billion DNA bases to cover, this makes a total of about three hundred thousand blocks to color in.
Mike MacPherson, the scientist who helped develop the program, explained to me on my visit that there are six possible combinations for each of these blocks along the chromosome pairs: African/African, Asian/Asian, European/European, and then the combinations of African/European, African/Asian, and Asian/European. Mike’s algorithm chooses which of these combinations fits best with the DNA being analyzed and fills in the painting accordingly. “African” blocks are colored a light green, “Asian” blocks are orange, and “European” blocks are dark blue. As each block is analyzed and painted separately, there is a set of conventions that govern the coloring of the top and bottom slices. When both copies of a block have their best match with only one of the reference samples, as in African/African, for example, then both top and bottom slices are painted green. The rules come into play when the two blocks match different reference samples. So for African/European blocks the top slice is painted green for African, and the bottom slice is dark blue for European. An African/Asian block has the Asian orange on the top slice and African green underneath. The third mixed block, Asian/European, has blue on top and orange below. There are examples of all of these in the illustrations for chapter 19.
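The coloring conventions amount to a small lookup table. The sketch below is my own illustration of those rules, not the actual painting software. It covers only the step of turning a chosen combination into top and bottom colors, not the harder job of deciding which reference sample a block matches best, and all the names in it are hypothetical.

```python
from itertools import combinations_with_replacement

# Colors for each reference population, as described in the text.
COLORS = {"African": "light green", "Asian": "orange", "European": "dark blue"}

# Which origin goes on the top slice and which on the bottom for mixed blocks.
MIXED_ORDER = {
    frozenset({"African", "European"}): ("African", "European"),
    frozenset({"African", "Asian"}): ("Asian", "African"),
    frozenset({"Asian", "European"}): ("European", "Asian"),
}


def paint_block(origin_a, origin_b):
    """Return (top_color, bottom_color) for one block of a chromosome pair."""
    if origin_a == origin_b:
        # Both copies match the same reference sample: one color top and bottom.
        return COLORS[origin_a], COLORS[origin_a]
    top, bottom = MIXED_ORDER[frozenset({origin_a, origin_b})]
    return COLORS[top], COLORS[bottom]


# Three same-origin pairs plus three mixed pairs give the six combinations.
all_pairs = list(combinations_with_replacement(sorted(COLORS), 2))
print(len(all_pairs))  # 6
for pair in all_pairs:
    print(pair, paint_block(*pair))
```

Because the mixed pairs are looked up as unordered sets, the result is the same whichever copy is named first, which mirrors the fact that the chip cannot tell which parent a block came from.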
Returning to my theoretical reconstruction of the president’s chromosomes, given that his father, Barack senior, was from Kenya and his mother, Ann Dunham, was a European American from Kansas, I would expect the chromosome pairs in his body cells to be African green on top, following the convention mentioned above, and European blue beneath pretty well as in the monochrome version in Figure 4 (A). Chromosomes don’t actually look like this in real life. This is just a diagram of one of them, but it does give me the opportunity of pointing out one of the features of chromosomes. They are divided into two arms, separated in the diagram by the gray disc. The discs are there to represent attachment points for muscle-like proteins that help to pull the two chromosomes apart during cell division. Although the attachment points are made of DNA, their sequences are very repetitive and hard to analyze and consequently have not been included in the HapMap coverage or the SNP chips and are shaded gray in color versions. There are one or two other gray regions that have been left off the chips for the same reason, but they are only small and we can forget about them.
Figure 4. Following one of President Obama’s chromosomes through four generations. The light gray blocks are DNA of African origin, while the darker blocks have a European origin.
The president’s children have inherited one of each pair of chromosomes from him and the other from their mother, Michelle, the first lady. The chromosome coming from the president (B) is an amalgam of the two chromosomes in his body cells (A), shuffled by DNA exchange. There is usually only one exchange on each chromosome arm at each generation, so the chromosome going to his first daughter, Malia Ann, in Figure 4 (C) might look like this, although the random nature of DNA exchange makes the precise pattern unpredictable.
We know from conventional genealogical research carried out by Megan Smolenyak and reported in the New York Times on October 7, 2009, that Michelle Obama has some European ancestors. However, for the sake of simplicity, we will ignore these and assume that all her ancestry is African. So the example chromosome in Malia’s body cells would look like C, with the mixed African/European chromosome (B) from the president and an African chromosome from the first lady.
Looking into the future, to the time Malia Obama has her own children, the chromosome she passes on will be another amalgam of the two she inherited from her parents, randomly shuffled by DNA exchange, like D in Figure 4 perhaps. If, to keep it simple, she marries a man with an African genome, her child—let’s say it is a boy this time—will have arrangement E in his body cells. Most of the DNA in this pair of chromosomes has an African origin, all except for the European DNA in the dark blocks that have come, originally, from his great-grandmother, Ann Dunham. Sure, we can give a percentage of African and European DNA in this pair of chromosomes (roughly 88 percent African and 12 percent European by the look of it), and all the other chromosomes once we have “interrogated” them on the DNA chip. But in my view that doesn’t take us much further than the ethnic ancestry tests derived from the AIMs. What really distinguishes chromosome painting from its forerunner is that, since we know precisely where genes are located on each chromosome, we can tell the continental origin of each one in any individual.
In the president’s theoretical grandson—let’s call him Harry—most of the genes along this chromosome will have an African origin, but for genes located within the two-tone blocks, he will be working on a fifty-fifty combination of African and European genes. If, for example, the gene for the ABO blood group was in one of these blocks, then his blood group will be decided by a mixture of African and European DNA. If the block contained a muscle protein gene, his muscles would be powered equally by African and European genes. Since both the size and the boundaries of these blocks are so random, it is extremely unlikely that any two of the president’s grandchildren, unless they happen to be identical twins, will inherit the same blocks of European DNA, and hence the same European genes, on this chromosome. When all the chromosomes are brought into the comparison, then what was vanishingly unlikely becomes virtually impossible, and, though each of the president’s grandchildren may have close to the average of one-eighth European DNA that is expected, the number and the identity of the genes with a European ancestry will be quite different in all of them.
By the time of the next generation, assuming once more for simplicity that Harry marries an African, his child, the by-now-former president’s great-grandchild, will have only one small segment of European DNA on the chromosome we have been following. It came originally from the president’s mother and has survived through four generations, diminishing by roughly half at every one. It may survive for many more generations to come, or it may be eliminated by the forces of random chance at any one of them.
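The halving at each generation, and the part that chance plays in it, can be mimicked with a toy simulation. This is a rough sketch under loud assumptions, none of them drawn from real data: the chromosome is cut into 100 equal blocks, the attachment point sits at block 40, there is exactly one exchange on each arm per generation, and every partner who joins the line contributes a wholly African copy.

```python
import random


def make_gamete(copy_1, copy_2, centromere, rng):
    """Shuffle two parental copies into one, with one exchange on each arm."""
    n = len(copy_1)
    # One breakpoint per arm (assumed; the text says this is only "usually" so).
    breaks = {rng.randrange(1, centromere), rng.randrange(centromere + 1, n)}
    # Start copying from either parental copy with equal chance.
    source, other = (copy_1, copy_2) if rng.random() < 0.5 else (copy_2, copy_1)
    gamete = []
    for i in range(n):
        if i in breaks:
            source, other = other, source  # switch copies at the exchange point
        gamete.append(source[i])
    return gamete


def follow_chromosome(generations, n_blocks=100, centromere=40, seed=None):
    """European fraction of this chromosome pair in each successive descendant."""
    rng = random.Random(seed)
    african = ["A"] * n_blocks
    # Starting pair: one wholly African and one wholly European copy (Figure 4, A).
    pair = (african, ["E"] * n_blocks)
    fractions = []
    for _ in range(generations):
        passed_on = make_gamete(pair[0], pair[1], centromere, rng)
        # The child's other copy is assumed to be wholly African.
        pair = (passed_on, african)
        fractions.append(passed_on.count("E") / (2 * n_blocks))
    return fractions


# Child, grandchild, great-grandchild: any single run varies with chance.
print(follow_chromosome(3, seed=1))

# Averaged over many runs the fractions approach 1/4, 1/8, and 1/16.
runs = [follow_chromosome(3, seed=s) for s in range(2000)]
print([round(sum(r[g] for r in runs) / len(runs), 3) for g in range(3)])
```

Individual runs scatter widely around those averages, and some lose the European segment altogether, which is the point about random chance made above.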
We have followed only the ancestry of one chromosome through three generations, and even then we have assumed that the chromosomes that joined the genealogy from outside are of entirely African ancestry. As you can imagine, where these incoming chromosomes are themselves built up of blocks of DNA with different continental ancestries, the picture soon becomes very complicated. But, however intricate it is, we would still be able to recognize the ancestral origin of the blocks of DNA and identify the genes that were contained within each one of them. I liked the way the chromosome portraits got so close to the actual situation and illustrated it so well. Our genomes are all mixtures built up of bits and pieces from a huge number of ancestors, and when these ancestors came from different continents, the variety is both obvious and intriguing. This was what I wanted to explore in America, and as soon as I returned to England from San Francisco, I began to plan in detail how to go about it.
SECOND MOVEMENT
12
The New Englanders
Col. William Prescott, Bunker Hill Memorial, Boston, Massachusetts.
I realized at the start that this was going to be very different from all the genetics projects I had ever done. Previously, in my research into the genetic history of Polynesia, Europe in general, and most recently Britain and Ireland in particular, I had started with the presumption that I needed to study large numbers of people. First, that was in order to satisfy the statisticians who guard the entry gates to respectable scientific journal publication and for whom nothing is true unless it is also statistically significant. They miss an awful lot. Second, I didn’t know from the start what was out there, and in limiting numbers I might have missed a vital genetic ingredient. This last point has turned out to be important, as I have been able to pick out the small proportion of people whose DNA seems completely out of place but carries the echoes of historical events. Events like the introduction of African slaves during the Roman occupation of Britain, or the Korean genes that have washed up on the shores of the North Atlantic. Very unusual DNA that might well have been missed in less substantial surveys. In America, however, this would have been impossible. It would have cost millions to use the same approach, and anyway would only replicate what others had already done. I also have to admit to being rather reluctant to write another genetic history along the same lines I had already followed in other parts of the world. Besides, there were issues I wanted to look into that would not have interested purely scientific publications.
The whole project began, in my own mind, to assume the character of the “last big job,” much as bank robbers are said to look forward to just one final payoff before hanging up their weapons. And always getting caught. For my “last big job” I decided on a completely different approach. Instead of the comprehensive and detailed planning that had gone into earlier research projects, I decided this one would be guided by chance events from the start: I would just see where they led. I think I might have been influenced by seeing Easy Rider again for the first time in years, a film in which the characters played by Peter Fonda, Jack Nicholson, and Dennis Hopper wander aimlessly around the southwestern United States on their way to New Orleans for Mardi Gras. Or maybe it was the image of “Shoeless Joe” Jackson materializing in the cornfields of Iowa in Phil Alden Robinson’s movie Field of Dreams starring Kevin Costner. “If you build it, they will come.” Freed from having to please the stats police, since I had no need or ambition to publish elsewhere than in DNA USA, I would take it easy, travel around, and just see what happened.
Though I had been to America on many occasions, I didn’t have a sense of how big it really was. Like many Europeans, my experience had been more or less confined to the east and west coasts, with a couple of days in Chicago. Of course I knew it was big, but not how big. I knew it lay somewhere between the extremes of a country whose dimensions I was used to, namely Britain, and infinity. I didn’t have any real feeling for where America was on this scale. To put this right, I decided to travel coast to coast by train, at least in one direction.
My son, Richard, just eighteen, agreed to come with me for the first leg east to west. Of all my research assistants, both paid and unpaid, Richard has been on more DNA expeditions than any other, and from a very young age. He was only six years old when he came with me to Scotland and toured the blood donor clinics. He was twelve when we set off for a three-month tour of Australia, New Zealand, and Polynesia. This time I knew would be the last. He was about to leave home for college. The three weeks we would spend in America would be the last of many long adventures together, after which I knew things would never be quite the same again. On the return leg across America I would be joined by Ulla, whose natural effervescence was to prove invaluable when it came to recruiting volunteers.
I would like to say that we set off that September with the easy confidence and optimism that are essential for any road movie remake. But that was far from reality. When Richard and I arrived in Boston it was cold, gray, and pouring with rain. City rain, falling not with anger or finesse, but just dripping from the sky as though a damp gray sponge had been suspended from unseen pinnacles low across the city. The sullen drops fell languorously onto the metal window ledge of the downtown hotel, tapping out a monotonous rhythm in 4/4 time. The next morning it was still raining. The task I had set myself seemed overwhelming and without limit, mocking my conceit that I could ever tackle such a gargantuan undertaking as writing a genetic biography of the United States of America.
By the next day the skies had cleared, and Richard and I set off for our first appointment. We walked across Boston Common, past the lake, where weary boatmen propelled their swan boats brimful of visitors around a figure-eight course. The first colors of fall were touching the elm trees, their farthest leaves a pale yellow under a bright blue sky. The sun, by now well above the horizon, reflected off the golden dome of the Massachusetts State House near the top of Beacon Hill. We were heading for the affluent streets off Commonwealth Avenue and the headquarters of the New England Historic Genealogical Society in Newbury Street, the oldest and best-known genealogy society in America. We passed shops that signaled the discreet wealth of the neighborhood: Chanel, Valentino, Diane von Furstenberg, past carefully choreographed climbing plants, their tendrils twined around iron railings. The headquarters building of the society had once been a bank and still retained the formal grandeur of its earlier life. Great bronze doors lay behind a cluster of clipped green bushes, their austerity mitigated by a Visitors Welcome sign. Inside we found ourselves in a large oak-paneled hall, a sumptuous chandelier dripping from the high ceiling, with books and portraits of early New Englanders lining the walls. If there was ever a place to immerse myself in this region and its people, 101 Newbury Street was it.