Gray’s dates, if correct, are somewhat revolutionary because they show the roots of Indo-European are far older than expected and that language can be traced back far deeper in time than most linguists think likely. Moreover, a reliable dating method would at last allow language change to be correlated with the information emerging from archaeology and population genetics.
Many linguists say Gray’s dates can’t be right, essentially because they conflict with the dates given by linguistic paleontology. But linguistic paleontology is a fuzzy technique, dependent on judgment and vulnerable to undetected borrowing and fallacious reconstructions. Gray’s technique applies a sophisticated statistical method, of proven value in phylogeny, to a reliable data set, the Dyen list, which represents the fruit of Indo-European linguistic scholarship. As a pioneering approach, it may well need refinement, or turn out to have some unexpected flaw. But as compared with linguistic paleontology, it doesn’t seem so obviously less credible.
Gray says he has great respect for the scholarship and methods of historical linguistics and hopes linguists will come around to taking his tree seriously, once they understand that his technique avoids the much discussed errors of glottochronology.
Using a simpler phylogenetic technique, Peter Forster, an archaeologist at the University of Cambridge, has drawn up a family tree of several Celtic languages including Gaulish, the version spoken in ancient France before the Roman conquest, as well as Welsh, Breton and Gaelic. Celtic is a major branch of Indo-European. Forster’s tree implies that Indo-European had diverged around 10,000 years ago, and that Celtic had split into Gaulish and its British branches by 5,200 years ago.270 These dates have wide margins of error, but are in the same range as Gray’s.
Gray’s date of 8,700 years ago for the first split in the Indo-European language tree lends considerable weight to the Renfrew hypothesis that the invention of agriculture drove the spread of Indo-European.
The implications reach beyond the specific case of Indo-European. Success of the biologists’ tree building methods would mean that languages can be reconstructed back to 9,000 years ago, considerably farther back in time than many linguists have supposed. The prospects for reconstructing even older trees of human languages may not be entirely hopeless.
The Greenberg Synthesis
Gray’s tree building bears on a dispute that has long divided historical linguists. The issue is how best to assess the relationships between today’s languages, given that language changes so fast. The world’s 6,000 living languages lie at the tips of a long-vanished tree. Can that tree be reconstructed for other families besides Indo-European? Can these families be grouped into superfamilies so as to reach time depths even deeper than that of Indo-European?
The classification of languages is a matter of considerable disagreement. Many linguists, being familiar with the extreme mutability of language, are skeptical of attempts to find ancient relationships between living tongues. Languages change so fast, they believe, that the number of words two diverging languages may share because of a true cognate relationship quickly dwindles to near zero. Indeed, the number of cognates may fall to the same level as the number of word resemblances that arise purely by chance. Unless that point is recognized, the incautious researcher may assert relationships where none exist. The only acceptable way of avoiding such traps, many linguists believe, is with an approach called the comparative method. The comparative method is highly reliable. Its drawback is that it is so rigorous that it does not reach very far back in time.
Other linguists believe that the comparative method is useful for confirming a postulated relationship between two languages, but is too strict to help detect such relationships in the first place. A leading figure in this school of thought has been the late Joseph H. Greenberg of Stanford University. During his lifetime Greenberg classified almost all of the world’s languages, showing how they could be grouped into some 14 superfamilies.
The superfamily classifications achieved by Greenberg and his colleague Merritt Ruhlen have been greeted warmly by geneticists because these groupings of languages largely mesh with the population splits inferred from gene-based genealogies. But many linguists repudiate Greenberg’s language families, arguing that his method is unreliable and that his work contains errors.
FIGURE 10.3. THE WORLD’S LANGUAGE SUPERFAMILIES.
The language superfamilies of the Old World, as defined by Joseph Greenberg and Merritt Ruhlen. The Basque and Burushaski languages, shown by arrows 1 and 2, are entirely unrelated to their neighbors and may be relicts of more ancient languages. Ket (arrow 3) may be the mother tongue of the Na-Dene languages of North America.
Greenberg was not formally an outsider to the linguistic establishment. He served as president of the Linguistic Society of America and was one of the few linguists to have been elected to the National Academy of Sciences. Aside from his work on classification, he founded a subfield of linguistics known as typology, to do with universal patterns of order in the grammatical elements of language. His 1962 article on typology is said to be the most widely cited in the history of linguistics.
Greenberg’s training, however, was not in linguistics but social anthropology. He did fieldwork studying the ethnography of pagan cults among the Hausa-speaking people of west Africa, spent the years from 1940 to 1945 in the Army Signal Intelligence Corps, mostly decrypting Italian code, and after the war turned his attention to the interrelationship of African languages.
These had been largely the purview of English and French linguists who had classified them with the help of various criteria, like the physical type of the speakers, that Greenberg deemed irrelevant to language origins. He developed his own, purely linguistic method, which he later called mass comparison. It was based on comparing grammar and some 300 items of vocabulary, such as pronouns and words for parts of the body, that, as Swadesh had found, are less prone to linguistic change. Greenberg would fill notebooks with lists of languages down the left column and word meanings along the top, and simply search in his mind’s eye for relationships.
He started out with Hausa, trying to see what other languages it might be related to by comparing common words and deciding if the languages fell into groups. Over the space of 5 years, Greenberg kept arranging the 1,500 then known languages of Africa into larger and larger assemblies, until he had grouped them into just 16 superfamilies, and finally only four. He put the odd and ancient click languages of southern Africa into the group named Khoisan. The languages of central Africa, including the widely spoken Bantu languages, he assigned to a group he called Niger-Kordofanian. He decided that the Bantu languages must have originated in west Africa, because that is where their diversity is greatest. From that it followed that the present-day Bantu languages, which are distributed down the west and east coasts of Africa, must have arisen from a migration out of the homeland that had split into two streams, one going directly down the west coast, the other crossing the breadth of Africa and then turning south down the east coast. This inference was later confirmed by archaeologists.
Greenberg’s third group was Nilo-Saharan, a family of languages spoken by Nilotic peoples like the Nuer and the Dinka as well as by people of the Saharan region and by the Songhay of west Africa. The fourth group of languages, spoken in a swath across northern Africa, he named Afroasiatic. This family includes Berber of northwestern Africa, ancient Egyptian, and Semitic, a branch to which belong Arabic, Hebrew and Akkadian, the extinct language of the Assyrians and Babylonians.
Greenberg’s sweeping classification of African languages has stood the test of time and is broadly accepted, although scholars continue to rearrange the furniture. The African languages are of particular interest because of their diversity and presumed antiquity. At the latest count some 2,035 are now known, of which 35 belong to the Khoisan family, 1,436 to Niger-Congo (a new name for Greenberg’s Niger-Kordofanian), 196 to Nilo-Saharan and 371 to Afroasiatic.271
Ehret has attempted to date the period when the proto-languages of Greenberg’s four groups were spoken. On archaeological evidence, he estimates that proto-Khoisan was first spoken about 20,000 years ago. The ancestral tongue of the Niger-Congo family may date back to 15,000 years ago, since a junior branch of the family had spread across the yam growing regions of west Africa from 8,000 years ago. Proto-Nilo-Saharan, on the basis of glottochronology, may be 12,000 years old.272
Afroasiatic is a language family of general interest since its West Semitic branch includes Hebrew, Aramaic and Arabic, the founding languages of three popular religions. Many people have assumed the ancestral homeland of proto-Afroasiatic was in the Near East, some for a miscellany of unscientific reasons, others because the Near East is a known center of early agriculture from which growing populations might have expanded into Africa, carrying their language with them. But an African origin seems more likely, in Ehret’s view. Of the six major branches of Afroasiatic, five lie in Africa—Berber in northwest Africa, Chadic around Lake Chad at the southern edge of the central Sahara, Cushitic in the Horn of Africa, Omotic in the Ethiopian highlands, and ancient Egyptian.
Following the rule that the region of greatest diversity is usually the homeland, this distribution points strongly to an ancestral homeland for Afroasiatic somewhere in northern Africa, which the Semitic speakers left to invade the Near East, perhaps some 9,000 years ago.273 (Later, about 2,500 years ago, some crossed back from Yemen into Ethiopia, giving the country its principal language, Amharic.) Also pointing to an African homeland, the earliest branching of proto-Afroasiatic was into Omotic and the rest, and the second branching was into Cushitic and the rest. Since Omotic and Cushitic are both restricted to Africa, that has “put it beyond doubt that the ancestral language, proto-Afroasiatic, was spoken in Africa,” writes Ehret.274
FIGURE 10.4. THE AFROASIATIC LANGUAGE FAMILY.
The major branches of the Afroasiatic language family. Arabic is now spoken in the area shown as belonging to Ancient Egyptian.
Though Greenberg’s classification of African languages is now broadly accepted, it was for many years bitterly resisted by British Africanists. In linguistics as in other academic fields, specialists tend to resent the generalist who shows how their little patch relates to a larger order. Paul Newman, a linguist at Indiana University, recalls visiting the London School of Oriental and African Studies around 1970, some 15 years after the first publication of Greenberg’s African work. He was told that it was quite safe for him to go into the common room, as long as he did not mention Greenberg’s name.275
After his African classification, Greenberg turned his attention to the question of American Indian languages. Taking note of the archaeological findings that the Americas had been settled only recently, Greenberg expected to find far fewer language families than in Africa. But American linguists, then undergoing a splittist phase, had agreed at a conference in 1976 that no fewer than 63 independent language families were spoken in the Americas. Greenberg, using the same mass comparison method he had developed for Africa, announced there were just three—Amerind, Na-Dene and Eskimo-Aleut.276
Greenberg’s conclusions induced the same agitation among American linguists as his African classification had among the British. And even though American linguists had generally accepted his grouping of African languages, they now assailed him with a fury that startled the population geneticists who were beginning to take an interest in his work. Luca Cavalli-Sforza, an eminent geneticist at Stanford University, wrote of his dismay at the linguists’ diatribes against Greenberg.277
Cavalli-Sforza’s confidence in Greenberg’s approach stemmed from the fact that, at least in general outline, he had confirmed it by an independent approach. Before methods of DNA analysis became available, Cavalli-Sforza and colleagues had worked out a genetic family tree of the world’s populations in terms of protein differences. Comparing this tree to Greenberg’s list of major language families, Cavalli-Sforza showed that peoples who were grouped together on his world population tree tended to fall into the same language family, as defined by Greenberg.278 Further analysis proved that the correspondence between the world’s human population tree and Greenberg’s language families was statistically significant.279
The Comparative Method versus Mass Comparison
Despite Cavalli-Sforza’s support for Greenberg’s findings, linguists continued to assail Greenberg’s work on grounds of factual errors and methodology. As even Greenberg’s supporters concede, he was interested in the big picture, not the details. Numerous small errors, of the type scholars usually do their best to avoid, crept into his work. Some were errors of transcription, some perhaps the result of working in haste as he reviewed the grammar and vocabulary of hundreds of languages, transcribing everything with his own hand and usually without a graduate student to check things. Were the errors fatal, as his Americanist critics contended, or trivial, as his supporters averred? The verdict of the Africanists, who came to agree with him, is that the errors were not significant. “There are . . . more errors in data-entering than one expects in such a work,” writes Lionel Bender, an Africanist at Southern Illinois University, about Greenberg’s book on African languages. “Nevertheless, he got it right for the most part and his African classification culminating in the 1963 book is a tremendous advance.”280
The larger point of Greenberg’s critics was that in establishing relationships among languages he had failed to use what is known as the comparative method, the orthodox approach to classifying languages. The method is based on identifying sets of related words that change in predictable ways between members of a language family. The French and Italian words for “goat” (chèvre and capra) are not particularly similar, but when compared with other words it is clear that a “k” sound in Italian corresponds with a “ch” sound in French, and a “p” in Italian corresponds with an “f” or “v” in French (as in capo and chef, or capra and chèvre).281 These sound correspondences exist because many French and Italian words are cognates, or descendants of the same parent word in their common ancestor tongue of Latin.
Once the rules of sound correspondence between contemporary languages have been established, the word in the parent language can be reconstructed. Scholars have reconstructed an extensive vocabulary in proto-Indo-European, the hypothesized ancestral tongue of many European and Indian languages. Any claim that a language is part of the Indo-European family can then be tested by seeing if its grammar and vocabulary can be derived, by the established rules, from proto-Indo-European. From the instances above, English might not seem so promising a candidate, but the initial “k” sound in Latin is known to correspond with an “h” in the Germanic group of languages, making head and the German word Haupt (now a figurative word for head) cognates of Latin caput. By the same rule Latin canis is cognate with German Hund and the English word hound, all being derived from proto-Indo-European *kwon.
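The first step of the comparative method, tallying which sounds regularly line up across candidate cognates, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word pairs are familiar textbook cognates chosen for the example, the helper names are hypothetical, and comparing the initial spelling of each word is a crude stand-in for the phonetic analysis a linguist would actually carry out.

```python
from collections import Counter

# Illustrative Italian/French cognate pairs (assumed examples, both from Latin).
ITALIAN_FRENCH = [
    ("capra", "chèvre"),  # goat
    ("capo", "chef"),     # head, chief
    ("cane", "chien"),    # dog
    ("caro", "cher"),     # dear
    ("campo", "champ"),   # field
]

# Illustrative Latin/English cognate pairs (assumed examples).
LATIN_ENGLISH = [
    ("caput", "head"),
    ("canis", "hound"),
    ("cornu", "horn"),
    ("cor", "heart"),
    ("centum", "hundred"),
]

# Spellings treated as a single sound. This is a simplifying assumption.
DIGRAPHS = ("ch",)


def initial_sound(word):
    """Return the word's first sound, approximated by its spelling."""
    for digraph in DIGRAPHS:
        if word.startswith(digraph):
            return digraph
    return word[0]


def initial_correspondences(pairs):
    """Count how often each pairing of initial sounds occurs."""
    return Counter((initial_sound(a), initial_sound(b)) for a, b in pairs)


italian_french = initial_correspondences(ITALIAN_FRENCH)
latin_english = initial_correspondences(LATIN_ENGLISH)

print(italian_french.most_common(1))  # [(('c', 'ch'), 5)]
print(latin_english.most_common(1))   # [(('c', 'h'), 5)]
```

A correspondence that recurs across many unrelated meanings, as the hard “c” does here, is the kind of regularity the comparative method treats as evidence of common descent; a single look-alike pair would not count.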
Rigorous application of the comparative method has freed linguistics from many false etymologies and crank theories. Many linguists insist that the comparative method is the only acceptable way of testing whether languages are related to each other. This position is based on the belief that, since words change so fast, two daughter languages will soon have only a small percentage of their vocabulary in common and at this point the number of true cognates may be exceeded by chance resemblances and words that sound alike because the two languages under comparison each borrowed them from a third.
Because the signal of the true cognates is soon overwhelmed by the noise of specious ones, the roots of a family of languages, linguists say, can be traced no farther back than about 6,000 years, the period when most linguists believe proto-Indo-European was spoken.
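The shape of this signal-versus-noise argument can be shown with a toy model. The sketch below assumes, purely for illustration, that each daughter language independently keeps a fixed fraction of a core word list per millennium and that a fixed fraction of word pairs resemble each other by chance. The 86 percent retention figure is the classic glottochronological assumption and the 5 percent chance level is arbitrary; neither comes from this chapter, and the function names are hypothetical. The constant-rate model is also the very simplification for which glottochronology is criticized, so the result should not be read as an estimate of the real horizon.

```python
import math

# Assumed values for illustration only; neither comes from the text.
RETENTION_PER_MILLENNIUM = 0.86  # fraction of core words one language keeps
CHANCE_RESEMBLANCE = 0.05        # fraction of word pairs alike by accident


def shared_cognates(millennia, retention=RETENTION_PER_MILLENNIUM):
    """Expected fraction of core words still cognate in two daughter languages.

    Each language changes independently, so the shared fraction falls as
    retention ** (2 * millennia).
    """
    return retention ** (2 * millennia)


def horizon_millennia(noise=CHANCE_RESEMBLANCE,
                      retention=RETENTION_PER_MILLENNIUM):
    """Time at which the cognate signal falls to the chance-resemblance level."""
    return math.log(noise) / (2 * math.log(retention))


for t in (2, 6, 10):
    print(f"{t} millennia: {shared_cognates(t):.0%} of core words shared")

print(f"signal meets noise after about {horizon_millennia():.1f} millennia")
```

The horizon moves a good deal with the assumed retention rate and chance level, which is one reason the time depth reachable by word comparison remains disputed.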
Greenberg, in his method of mass comparison, did not look for sound correspondences, nor did he try to reconstruct proto-languages to confirm his findings. Hence, in the view of many linguists, his method and findings cannot be trusted.
Whatever the theoretical objections to Greenberg’s method, the bottom line is the empirical question of whether or not it works. Africanists have decided it did indeed work for African languages. But this apparently persuasive circumstance has not changed linguists’ views about the validity of Greenberg’s method. In a recent essay on Greenberg’s Afroasiatic family, Richard Hayward, of the London School of Oriental and African Studies, writes that the “only admissible evidence” for establishing that languages have a common ancestry is by the comparative method and sound correspondences. “Now it was on the basis of ‘mass comparison,’ rather than the comparative method, that the canon of the Afroasiatic languages was established by Greenberg, and although this methodology . . . has, in the present writer’s view, come up with the right conclusions, a methodology that does not invoke the rigour of the principle stated in the last paragraph [i.e., that of the comparative method] cannot make predictions, and so falls short of true theoretical status,” Hayward writes.282 In other words, even if Greenberg got the right answer, it was by the wrong method.
If the faculty of human language were extremely ancient, and if human populations were highly mixed, the likelihood of languages on the same continent being related to each other might be small, and it would be appropriate to assume languages were unrelated unless proven otherwise. But since fully modern language probably evolved only 50,000 years ago, and since today’s populations still strongly reflect the original patterns of human migration, the reverse is the case: all languages are probably offshoots of a single mother tongue and related to each other at one level or another. In circumstances where history and archaeology make language relationships very likely, such as in the Americas, a lesser standard of proof would perhaps be appropriate. It is surely in Africa, where languages have had longest to diversify, that Greenberg’s mass comparison method stood least chance of success, yet it is there that linguists judge it to be most successful.
Before the Dawn: Recovering the Lost History of Our Ancestors Page 27