DNA USA


by Bryan Sykes


  The population grew steadily but not spectacularly, so that by the time of the Declaration of Independence in 1776, somewhere between five hundred thousand and a million people, mainly from Britain and northwest Europe, had crossed the Atlantic to begin a new life in America. Many arrived as indentured servants, working for several years to repay their employers the cost of their passage. When they had worked off this debt they were, in theory, granted land of their own, although this often did not materialize. The beginning of the nineteenth century saw a very large expansion in the number of immigrants from northern Europe following the Louisiana Purchase of 1803, when Thomas Jefferson engineered the acquisition of territories originally claimed by France west of the Mississippi for $15 million ($220 million today). The transcontinental expedition of 1804–6, undertaken at Jefferson’s request by Meriwether Lewis and William Clark, set off from St. Louis to explore the new territories along the Missouri and farther west in what were to become Washington and Oregon. As well as its scientific and geographical mandate, one objective of the expedition was to establish a claim over these northwest territories before the British, who were operating out of Canada.

  Through statutes like the Homestead Act of 1862, successive US governments encouraged large-scale immigration and settlement of the West, partly to discourage the British. Coupled with industrial growth in the Northeast, this territorial expansion attracted more than twenty-five million Europeans, mainly from northern Europe, to emigrate to the United States by the end of the nineteenth century. At the turn of the twentieth century, Europeans were still arriving in large numbers, but mainly from southern and eastern Europe, and this trend continued until the Immigration Act of 1924, which attempted to limit immigration of Jews, Italians, and Slavs by imposing national quotas. European immigration slumped during the Great Depression, but the quotas were retained, eventually becoming one factor that prevented many Jews from escaping Nazi persecution by emigrating to the United States. National quotas were abolished in 1965, since when the great majority of immigrants have originated from outside Europe. Even so, a glance at the population ancestry map of the United States following the 2000 census, which plots the population origin of the majority residents in each county in each state, shows that Europe is still the ancestral origin for most Americans. (See map in color insert section.)

  The effects of large-scale immigration during the nineteenth century are very clearly reflected on the map, with twenty of the forty-eight states in the contiguous United States (that is, omitting Alaska and Hawaii) drawing their largest population from Germany. Italian Americans predominate in New York State, Irish Americans are the largest in Massachusetts and New Hampshire, while Britons are the majority origin only in Vermont, Connecticut, and, through their Mormon origins, Utah. However, in Arkansas, Tennessee, Kentucky, and West Virginia the largest number self-declared as “American,” a category that no doubt encompasses many with ultimately British roots. In four southwestern states with historical ties to the early Spanish territories—California, Arizona, New Mexico, and Texas—the greatest number declared Mexican origins. In seven southern states—Louisiana, Missouri, Alabama, Georgia, North and South Carolina, and Virginia—the largest proportion declared themselves as African Americans.

  In no states were American Indians in the majority, but when the states were subdivided into counties, they did make up the largest proportion of the population in parts of Arizona and New Mexico, Oklahoma, South Dakota, and Montana. These higher-resolution plots also showed that the overall German dominance in the northern half of the United States was punctuated by counties with their greatest proportions drawn from other European countries. Norwegians dominate in parts of North Dakota and Minnesota, while Finns are in the majority in some regions of Michigan bordering Lake Superior. In a few counties within southern Michigan and Iowa, Dutch Americans are in the majority, while Americans of French origin predominate around their former colony in New Orleans. Perhaps unexpectedly, there is an Irish majority in two counties in southern Washington State and around Butte, Montana.

  Apart from the numerical dominance of European Americans in large swaths of the United States, the other, subtler feature of the map is that so many respondents in the 2000 census knew what their ancestral origin was, at least sufficiently to declare it as such on the census form. This reflects a key feature of European American ancestry, which is that almost everyone either knows what their European origins actually were or has a strong belief and attachment to them. In complete contrast to Native Americans, there is no general uncertainty surrounding the origins of European Americans and therefore no pressing need to recruit genetics to help sort these out.

  Among the potential exceptions are the questions surrounding the fate of the vanished colonists from Roanoke Island; whether or not the original Viking explorers who landed in Newfoundland left any descendants; and last, whether other European expeditions to the New World predated Columbus and left genetic evidence behind. I have been asked to look into all three of these puzzles in the past, but I have not undertaken any of them because I think there is very little chance of a clear result. It is not that there is any difficulty in distinguishing Native American from European DNA. That is easy. The difficulty lies in the sheer numbers of European immigrants who followed on the heels of these early arrivals. This makes it, in my view, quite impossible to be sure that, for example, any Norse origin DNA found among the modern-day inhabitants of Newfoundland could be confidently attributed to Leif Erikson’s men rather than to more recent Scandinavian immigration. The same applies to English DNA around Roanoke in North Carolina or to Portuguese genes in the Caribbean. That isn’t to say it can never be done—the best route would be the recovery of DNA from well-preserved and well-dated human remains—but not at present by comparisons with the modern populations.

  Nevertheless, while very few general mysteries surround the ancestral origins of European Americans, there is no limit to individual curiosity that, amplified by the importance attached to these origins reflected by the census returns, has given genetics plenty to do over the course of the last decade. When it comes to European Americans it is genealogy rather than anthropology that has gained by involving genetics. And while the research on the origins of Native Americans largely relied on mitochondrial DNA, as we have seen, genealogical research has involved the other important witness to the past, namely the Y chromosome. While mDNA tells the story of women, the Y chromosome is a chronicler of the behavior of men.

  This intriguing piece of DNA resides in the cell nucleus rather than the cytoplasm and is one of the twenty-four different chromosomes that together make up the human genome. But, while it has some of the properties of its other chromosomal companions in the nucleus, it is, by its very nature, an outsider. It travels alone through the generations without exchanging DNA with any other chromosome. Only men have Y chromosomes, and the reason for this is both straightforward and utterly fascinating. After fertilization, all human embryos start off as female and, if nothing happens to change their development, are born as baby girls. However, the Y chromosome contains a single gene that, when activated at about six weeks after conception, diverts the embryo from a female to a male trajectory. How it does this is complex and not completely understood, but the end result is straightforward: An embryo with a Y chromosome turns into a boy, while an embryo without a Y chromosome continues developing into a girl.

  When they set out to fertilize an egg, half of the sperm contain a Y chromosome and the other half do not. If the fertilizing sperm is one of the 50 percent with a Y chromosome, then the child is a boy and, conversely, if the fertilizing sperm does not have a Y chromosome, the baby will be a girl. This is why approximately half of babies are boys and half are girls. Thus, Y chromosomes are passed on exclusively from fathers to their sons.

  While it is the Y chromosome that has caught the imagination of genealogists, many European Americans also have an interest in their maternal ancestry, which is tracked very well by mitochondrial DNA. As we have seen, there are five clans among Native Americans, four of them, Aiyana, Ina, Chochmingwu, and Djigonase, from Asia and one, Xenia, from Europe. Xenia is one of the seven predominant maternal clans my research team and I first uncovered in the late 1990s and which formed the basis for The Seven Daughters of Eve. The cluster date for Xenia in Europe we estimated to be twenty-five thousand years, but that was an uncalibrated date not taking account of the founding lineages. My colleagues Martin Richards and Vincent Macaulay then undertook the Herculean task of combing European and Middle East data for mDNA matches in all the clans that would identify likely founder sequences, after which they recalculated the cluster dates. We published the results in 2000, and it was a relief to find that they were largely in line with our earlier uncalibrated estimates and reinforced our conclusion that the majority of the clan mothers were in Europe, as Paleolithic hunter-gatherers, before the arrival of Neolithic farmers from the Middle East about ten thousand years ago.1 You can find these calibrated cluster dates in the appendix. As usual with adjusted dates, they are younger than their uncalibrated equivalents because eliminating founder sequences reduces the average number of mutations in the remainder.

  One difference is the replacement of one clan (Velda) by another (Ulrike) in the ranking of the seven most prolific European clans. This is due to the extended geographical range of the analysis that took in new results from eastern Europe that had not been available in our first analysis. Velda is very much concentrated in western Europe while Ulrike is more common farther east. Another difference is that we recognized a major branch of the Tara clan (T2) that had a considerably younger date than the age estimates of the clan as a whole. This branch is probably of Neolithic rather than Paleolithic origin and joins the clan of Jasmine in entering Europe with the introduction of agriculture.

  All these clans are well represented among European Americans, exactly as expected. Moreover, clan frequencies in the United States are much the same as they are in the regional source population in Europe. Since the ancestors of all European Americans arrived within the last five hundred years, and many far more recently than that, there has been very little time for new mutations to accumulate among their descendants living today. Even if the ancestors of all European Americans had arrived with Columbus five hundred years ago, the average mutation rate of one change every twenty thousand years means that only one American in forty (20,000/500) would have a sequence that differed from that of his or her European ancestors. For Americans in search of their maternal relatives in Europe this has the advantage of limiting their search to only those individuals with exactly matching mDNA sequences.
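  The one-in-forty figure is easy to check. The short Python sketch below is only an illustration: it assumes the rate quoted above (one change every twenty thousand years) and treats the chance of a change as simply proportional to the time elapsed, which is reasonable when that time is short compared with twenty thousand years. The function name is made up for this example.

```python
def fraction_changed(years_elapsed, years_per_mutation=20_000):
    """Approximate fraction of maternal lines expected to carry a new change.

    Assumption (illustrative only): the chance of a change grows in direct
    proportion to elapsed time, using the rate of one change per
    twenty thousand years quoted in the text.
    """
    return years_elapsed / years_per_mutation


# Five hundred years since Columbus, as in the text.
fraction = fraction_changed(500)
one_in = round(1 / fraction)

print(f"Fraction of lines with a new change: {fraction}")
print(f"That is about one American in {one_in}")
```

  Put the other way round, roughly thirty-nine lines in every forty would be expected to match their European source sequence exactly, which is why an exact-match search is a sensible starting point.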

  Like mDNA, the Y chromosome groups individuals into clusters with a common ancestor, the only difference being that these clusters are patrilineal rather than matrilineal. Although Y chromosome DNA (or yDNA from now on) is much longer than mDNA, one type of variation is the same for both: That is the differences in sequence introduced by faulty copying, whereby one base is substituted for another. However, the rate of mutation is about twenty times slower in yDNA because, as with the other nuclear chromosomes, there is a very efficient quality-control mechanism that identifies copying errors and eliminates most of them. Mitochondria don’t have this error-checking mechanism, and so mutations have a better chance of getting through to the next generation. It was a real struggle to identify the yDNA sequence changes, but the hard work has been done, principally by Peter Underhill and his team at Stanford University, and we now have a good range of markers that are capable of distinguishing thousands of different Y chromosomes from one another. Because of their slow mutation rate these markers, which go by the acronym of SNPs (for “single nucleotide polymorphisms,” pronounced “snips”), have each probably changed only once during the course of human evolution and so are very useful for plotting out the overall Y-chromosome family tree. However, they are not very much use to genealogists interested in much more recent time frames.

  Fortunately help was at hand because a completely different type of yDNA marker was discovered in the mid-1990s and looked as if it was tailor-made for genealogists. The relevant acronym here is VNTRs, standing for “variable numbers of tandem repeats,” which sound more complicated than they really are. VNTRs are segments of DNA composed of repeating blocks of DNA sequence usually three to five bases long. They are very boring to read through, for example, AGTAGTAGTAGTAGTAGTAGTAGTAGTAGT where the triplet AGT is repeated ten times. What makes them interesting is that the number of times the triplet is repeated varies between different Y chromosomes. It is relatively easy to measure the length of these variable segments and so work out how many repeats any Y chromosome contains. In this example, instead of ten repeats, there might be anywhere between eight and twelve on different Y chromosomes. This is an excellent way to distinguish chromosomes that would otherwise be impossible to tell apart using the SNP system. And because Y chromosomes are the outsiders of the cell nucleus and don’t talk to or exchange DNA with their companions, they remain completely intact. This unsociable behavior has a tremendous advantage because it means that all the markers along the whole chromosome (well, barring the very ends, which we can ignore) are inherited together from one generation to the next. This in turn means that results from one marker can be combined with another and the effect is multiplied. So if there are five different repeats at one marker, like our example, and five at another, the number of possible and distinguishable combinations is not 5 + 5 = 10 but 5 × 5 = 25. 
You can imagine how the number of different Y chromosomes that can be recognized by the VNTR system increases very quickly, so that, with ten markers each with five different lengths, the number of combinations is 5^10, or about 9.8 million, and with twenty it reaches a staggering 5^20, or about 95 trillion (where a trillion is a million million). As this figure greatly exceeds the world’s entire male population of roughly 3.5 billion, it is plain to see that not all possible Y chromosome combinations actually exist. With this amount of variation available the scene was set for the revolution in genetic genealogy of the last decade.
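  For readers who like to see the counting spelled out, here is a minimal Python sketch of the two ideas in this passage: reading off the number of repeats at a marker, and multiplying the possibilities across markers. It is an illustration only, not how a laboratory measures repeat lengths, and the function names are invented for the example. It assumes every marker has the same number of possible lengths, as in the text.

```python
def count_tandem_repeats(segment, unit):
    """Count how many times `unit` repeats back to back from the start of `segment`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    count = 0
    # Step along the segment one repeat unit at a time until the pattern breaks.
    while segment.startswith(unit, count * len(unit)):
        count += 1
    return count


def signature_combinations(lengths_per_marker, markers):
    """Distinguishable combinations when markers are inherited together.

    Assumes each marker has the same number of possible lengths, so the
    possibilities multiply rather than add.
    """
    return lengths_per_marker ** markers


example = "AGT" * 10  # the ten-repeat segment described in the text
print(count_tandem_repeats(example, "AGT"))

for markers in (2, 10, 20):
    print(markers, "markers:", f"{signature_combinations(5, markers):,}")

# Compare with the rough male population figure given in the text.
male_population = 3_500_000_000
print(signature_combinations(5, 20) > male_population)
```

  With five lengths per marker, the exact totals are 25 for two markers, 9,765,625 for ten, and 95,367,431,640,625 for twenty, far more than there are men alive to carry them.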

  6

  The Genetic Genealogy Revolution

  Berry Pomeroy Castle, Devon, England.

  The speed and enthusiasm with which the American genealogy community has embraced genetics has been truly astounding. I vividly remember addressing a meeting of the New England Historic Genealogical Society, America’s oldest, in 2001 when only a tiny minority in the audience knew much about DNA and hardly anyone had heard of mitochondria or Y chromosomes. Now knowledge is detailed and extensive, and genuine advances are being made, fueled by the curiosity of members of the public about their roots rather than by academics.

  The very first scientific paper to feature Y chromosomes in any sort of genealogical connection concerned a man who has already made an appearance in this book, and will do so again. He is Thomas Jefferson, third president of the United States and principal author of the Declaration of Independence. Jefferson’s wife, Martha, died in 1782 following the birth of her sixth child, Lucy Elizabeth. Some years later Jefferson and one of his slaves, Sally Hemings, who was the half-sister of his late wife, became lovers and he may have fathered a further six children by her including her last son, Thomas Eston, born in 1808. The story is a fascinating one for many reasons, not least because, given his political prominence and the unforgiving nature of the times, Jefferson denied it. The scandal, for that was what it was, rumbled on for the next two hundred years until 1998, when genetic evidence proved, beyond reasonable doubt, that it was true. The study that clinched the verdict compared the combinations of markers in the Y-chromosome signature from a direct patrilineal descendant of Thomas Eston Hemings with equivalent relatives of the president (Figure 1).1 Thomas Jefferson did not have any surviving legitimate sons, so his Y chromosome had to be identified through a patrilineal relative, who in this case was the president’s paternal uncle Field Jefferson. Both Field and Peter Jefferson, the president’s father, had inherited the same Y chromosome from their father, Thomas. Therefore any patrilineal descendant of Field Jefferson would carry the same Y chromosome as the president. When the Y chromosomes of five such descendants of Field Jefferson (A–E) were compared with a direct descendant of Thomas Eston Hemings (F), they matched exactly. Moreover, the precise Y-chromosome signature was rare in the general population, making the match extremely significant. This conclusion has not gone down well with some Jefferson descendants, but it was the result of a decisive piece of work.

  Figure 1. Patrilineal relatives and descendants of President Thomas Jefferson with matching Y-chromosome signatures. Lengths of vertical links are approximately proportional to the number of generations.

  Although the Jefferson/Hemings case demonstrated the power of genetics to prove a genuine patrilineal connection, it did not make a general case for the use of the Y chromosome to follow surnames. In fact, quite the opposite. Because Thomas Eston Hemings was illegitimate, the son of a slave and her master, and carried his mother’s name rather than the president’s, it was a prime example of what geneticists call a nonpaternity event, where a surname does not follow the same line of descent as a Y chromosome. This could be because of illegitimacy, as here; adoption; a deliberate name change; or infidelity by the mother. Whenever this happens, the link between a surname and its Y chromosome is broken forever. It was the disruptive effect of nonpaternity on surname/Y-chromosome associations that persuaded those few geneticists who thought about such things at the time that the Y chromosome was unlikely to prove useful on a larger scale. I think we were swayed by the generally high rate of nonpaternity, sometimes as high as 10 percent, uncovered by conventional genetic fingerprinting among the modern population, whereas it was the historical rate that was more relevant. Only when I was curious to test another man with my surname, Sir Richard Sykes (at the time the chairman of the pharmaceutical giant Glaxo), to see if we were related, did the surprisingly high general correlation between surnames and Y chromosomes in men start to come to light.

 
