Very recently, another method has been proposed that promises to push our knowledge a little further back in time. Instead of comparing words, this technique uses computer-based strategies from evolutionary biology and compares syntactic structures, such as the ordering of verbs and subjects in a sentence. Researchers at the Max Planck Institute for Psycholinguistics in the Netherlands first tested it against a family of Austronesian languages whose relationships had already been uncovered by the comparative method. They came up with the same results. They then looked at fifteen Papuan languages that were not known to be related. Most of the words in the Papuan set looked unrelated, so the comparative method was of no help in determining whether any relationships existed. Using the evolutionary biology method, the researchers were able to reveal relationships between the languages that, crucially, were consistent with their geographical distribution. They suspect that these languages trace back to a common ancestor that was spoken more than ten thousand years ago.
Jackendoff has developed an approach for recovering ancient elements of language that would take us even further back into the past. He believes that language itself carries fossils of earlier forms, allowing us to reverse-engineer it back to an evolutionarily simpler state. Jackendoff was inspired by Derek Bickerton, one of the first linguists to develop the concept that before our current form of language, we must have communicated with a protolanguage, a simpler step on the way to modern words and syntax.
Jackendoff said: “The idea behind it is that there was this stage of ‘protolanguage’ preceding the stage of modern language. The logic behind figuring out protolanguage is that we can find aspects of modern language that could have served as an effective communication system without the rest of language. What Bickerton’s version of protolanguage has that no other animal communication system has is some kind of phonology, so you can build a large vocabulary. In addition it has the symbolic use of words, and it concatenates words to convey meanings that combine the meanings of individual words. What it doesn’t have to have is modern syntax.”
Even to achieve this level of protolanguage, you must have two or three very important innovations in place. “The construction-based view of language,” Jackendoff explained, “makes it natural to conceive of syntax as having evolved subsequent to two other important aspects of language: the symbolic use of utterances and the evolution of phonological structure as a way of digitizing words for reliability and massive expansion of vocabulary.” Once you have that, the rest can follow.
In his book The Symbolic Species, Terrence Deacon proposes that various platforms of understanding are necessary for an animal to use utterances symbolically. He invokes three types of reference described by Charles Sanders Peirce—iconic reference, indexical reference, and symbolic reference. Crucially, these distinctions are not inherent to any object or event in the world, but rather are descriptions of the kinds of interpretations that can be made about objects or events. Icons (or making an iconic interpretation) are the simplest type of reference. If an object is iconic, there is a similarity between it and something else. Landscape paintings, Deacon points out, are iconic of the landscape they depict. Indexes are a step more complicated than icons because they are built from iconic relationships. With an indexical interpretation, there is some kind of correlation, often causal, between an object or an event and something else. A skunk smell may indicate that a skunk is nearby. “Most forms of animal communication,” writes Deacon, “have this quality, from pheromonal odors (that indicate an animal’s physiological state or proximity) to alarm calls (that indicate the presence of a dangerous predator).”11 A symbol, in turn, is more complicated than an index, because it involves some kind of convention or system that guides the way we link one thing to another. A wedding ring is a symbol of marriage, writes Deacon, just as e is a symbol of a sound that we use in speech.
Complicated reference is thus created by layering simpler forms of reference together. Much animal communication makes extensive use of iconic and indexical reference, but only human language is rooted in the unusual and complicated relationships that exist with symbolic reference. The jump to symbolic reference from indexical reference is not straightforward, argues Deacon. Symbols do not exist by themselves; they exist only in the context of other symbols and, crucially, in the relationships between them. Making a symbolic interpretation involves simultaneously understanding where a symbol, be it a wedding ring or a word, exists with respect to other symbols in its set (in the case of a word, knowing which other words it can be combined with and which it can’t) and understanding the way it refers to objects or meaning in the world. Only because we understand symbolic reference and the ways that words must be combined with other words (which is to say that words are by their very nature syntactic) can we create modern human language with all its various structural possibilities. Even though symbolic reference is highly unusual in the nonhuman world, it’s not impossible for some animals to comprehend it. Kanzi is a good example of an animal that has been bootstrapped into this sophisticated form of understanding.
In the same way that Deacon lays out the progression of meaning as a logically layered hierarchy, Jackendoff proposes other elements that must have necessarily followed one another in the increasing elaboration of human language over time. He suggests that the next stage after the ability to use symbols was reached must have been the employment of symbols in nonspecific situations. Words like “damn it,” “ouch,” and “wow” have no syntax at all and are, he says, fossils of this stage. Remnants of the next stage—where symbols are a little more bound by syntax and a little more tied to context—are “ssh,” “psst,” “hello,” “goodbye,” “yes,” and “no.” (No animal communication system has an equivalent for “no,” argues Jackendoff, but it tends to be one of the first words in any child’s repertoire.)
After these early stages comes the development of a large set of symbols that is, in principle, unlimited. Next is the ability to place these items together in meaningful ways. Syllables and phonemes must have come next, preparing the way for the development of a simple protolanguage. Then, along with the appearance of grammatical categories and inflections, there must have been a proliferation of symbols for encoding abstract meaning, like words for space relationships (“up,” “to,” “behind”) and time (“Tuesday,” “now,” “before”).12 All these layers were required to build modern language, with its complicated syntax and special linguistic meanings, and all must have been in place before the first ancient human language split into the branches that lead to the many modern languages.
“It’s totally hypothetical, this reverse engineering,” said Jackendoff, “but it’s the kind of thing you would investigate across species if you could: You would look for another species that has only a subset of the features, corresponding to an earlier stage. Of course, there aren’t any such species for the deeply combinatorial aspects of language. But if we were looking for the evolutionary roots of eyes or toes this is exactly what we would do.”
Another domain in which humans use structure with virtuoso abilities is music, which is, like language, one of the species’ relatively few universal abilities. Without formal training any individual from any culture has the ability to recognize music and, in some fashion, to make it. Why this should be so is a mystery. After all, music isn’t necessary for getting through the day, and if it aids in reproduction, it does so only in highly indirect ways. Scientists have always been intrigued by the connection between music and language. Yet over the years, words and melody have been accorded a vastly different status in the laboratory and the seminar room. While language has long been considered essential to unlocking the mechanisms of human intelligence, music is generally treated as an evolutionary frippery—“auditory cheesecake,” as Steven Pinker put it.
But thanks to a decade-long wave of neuroscientific research, attitudes have been changing. A flurry of recent publications suggests that language and music may contribute equally to telling us who we are and where we have come from. It’s not surprising to find that many of the researchers engaged in the study of language evolution are also drawn to the evolution of music. Ray Jackendoff (who, in addition to being a linguist, was the principal clarinet of the Civic Symphony Orchestra of Boston for twenty years) and colleague Fred Lerdahl have investigated language and music as cognitive phenomena. Breaking music into its major components (rhythm, the structure of melody and harmony, emotion in music), Jackendoff and Lerdahl sought to identify which elements of music arise from general cognitive processes, which come from processes that are common to music and language, and what, if anything, is peculiar to music.
Their investigation of musical affect is most interesting with regard to these three questions. Clearly, some affect in music derives from a broader set of associations. For example, we are startled by sudden, loud noises, and this is as true of random noise as of sudden, loud musical outbursts. In addition, some affect appears to draw on a shared understanding of language and music. Jackendoff and Lerdahl point out that large structures in music can be like dramatic arcs in narratives. The slow buildup of tension, a climax, and then denouement can be found in both musical pieces and stories. It may be that both music and language exploit a human predisposition to understand events in terms of tension and resolution. Jackendoff and Lerdahl also suggest that the way people convert music into gesture, whether by dance or in conducting an orchestra, is instinctive and special to music alone. Different kinds of music invoke different kinds of movement; a waltz does not inspire people to march, and vice versa. Even very young children show a sensitivity to this aspect of music when they spontaneously dance. While it’s not possible to fully disentangle these aspects of the musical experience from one another, an investigation of the common and unique cognitive bases of music, say the researchers, contributes to its biological profile, which in turn helps track its evolutionary trajectory.
In an article in the Journal of Neuroscience, David Schwartz, Catherine Howe, and Dale Purves of Duke University investigated the question of the language and music relationship in a very different way, concluding that the sounds of music and the sounds of language are intricately connected.13
To grasp the originality of their idea, two things about how music has traditionally been interpreted must be understood. First, musicologists have long emphasized that while each culture stamps a special identity onto its music, music itself has some universal qualities. For example, in virtually all cultures sound is divided into some or all of the twelve intervals that make up the chromatic scale—that is, the scale represented by the keys on a piano. For centuries, observers have attributed this preference for certain combinations of tones to the mathematical properties of sound itself.
Some twenty-five hundred years ago Pythagoras was the first to note a direct relationship between the harmoniousness of a tone combination and the physical dimensions of the object that produced it. For example, a plucked string will always play an octave lower than a similar string half its size, and a fifth lower than a similar string two-thirds its length. This link between simple ratios and harmony has influenced music theory ever since.
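The ratio arithmetic behind Pythagoras’s observation can be sketched in a few lines of Python. This is only an illustration of the physics the passage describes, not anything from the original study; the 220 Hz reference pitch and the function name are arbitrary choices. For an ideal string under fixed tension, frequency is inversely proportional to length:

```python
def frequency(length_ratio, base_freq=220.0):
    """Pitch of a string whose length is length_ratio times the original.

    Halving an ideal string doubles its frequency; an ideal string at
    two-thirds length sounds a perfect fifth higher (a 3:2 ratio).
    """
    return base_freq / length_ratio

full = frequency(1.0)         # the reference string
half = frequency(0.5)         # half-length string: one octave above
two_thirds = frequency(2 / 3) # two-thirds-length string: a fifth above

print(half / full)        # octave  = 2:1 frequency ratio
print(two_thirds / full)  # fifth   = 3:2 frequency ratio
```

Read the other way around, this is the claim in the text: the full string sounds an octave *lower* than its half-length counterpart and a fifth lower than the two-thirds-length one, with the “harmonious” intervals falling out of the simplest whole-number ratios.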
Second, this music-is-math idea is often accompanied by the notion that music, formally speaking at least, exists apart from the world in which it was created. Writing in the New York Review of Books, the pianist and critic Charles Rosen discussed the long-standing conviction that while painting and sculpture reproduce at least some aspects of the natural world, and writing describes thoughts and feelings we are all familiar with, music is entirely abstracted from the world in which we live.
Neither idea is correct, according to Schwartz and colleagues. Human musical preferences are fundamentally shaped not by elegant algorithms or ratios but by the messy sounds of real life, and of speech in particular—which in turn is shaped by our evolutionary heritage. Said Schwartz, “The explanation of music, like the explanation of any product of the mind, must be rooted in biology, not in numbers per se.”
Schwartz, Howe, and Purves analyzed a vast selection of speech sounds from a variety of languages to determine the underlying patterns common to all utterances. In order to focus only on the raw sound, they discarded all theories about speech and meaning and sliced sentences into random bites. Using a database of over a hundred thousand brief segments of speech, they noted which frequency had the greatest emphasis in each sound. The resulting set of frequencies, they discovered, corresponded closely to the chromatic scale. In short, the building blocks of music are to be found in speech.
“Music, like the visual arts, is rooted in our experience of the natural world,” said Schwartz. “It emulates our sound environment in the way that visual arts emulate the visual environment.” In music we hear the echo of our basic sound-making instrument—the vocal tract. This explanation for human music is simpler still than Pythagoras’s mathematical equations: we like the sounds that are familiar to us—specifically, we like sounds that remind us of us.
This brings up some chicken-or-egg evolutionary questions. It may be that music imitates speech directly, the researchers say, in which case it would seem that language evolved first. It’s also conceivable that music came first and language is in effect an imitation of song—that in everyday speech we hit the musical notes we especially like. Alternatively, it may be that music imitates the general products of the human sound-making system, which just happen to be mostly speech. “We can’t know this,” says Schwartz. “What we do know is that they both come from the same system, and it is this that shapes our preferences.”
Schwartz’s study also casts light on the long-running question of whether animals understand or appreciate music. Despite the apparent abundance of “music” in the natural world—birdsongs, whale songs, wolf howls, synchronized chimpanzee hooting—previous studies have found that many laboratory animals don’t show a great affinity for the human variety of music making. Indeed, Marc Hauser and Josh McDermott of Harvard argued in a special music issue of Nature Neuroscience that animals don’t create or perceive music the way we do.14 The fact that laboratory animals can show recognition of human tunes is evidence, they say, of shared general features of the auditory system, but not of any specific musical ability.
But what’s been played to the animals, Schwartz noted, is human music. If animals have evolved preferences for sound as we have—based on the soundscape in which they live—then their “music” would be fundamentally different from ours. In the same way our scales derive from human utterances, a cat’s idea of a good tune would derive from yowls and meows. To demonstrate that animals don’t appreciate sounds the way we do, we’d need evidence that they don’t respond to “music” constructed from their own sound environment.
Of course, there are many examples of animal music. Fitch (who is also an avid amateur musician, composer, and singer) argues that it is worthwhile to examine these in comparison to human music and language. Fitch examined not just animal song, like birdsong and whale song—which must be learned as we learn to sing and talk—but also examples of animal instrumentation. The best examples of instrument use in nonhuman animals are found in our very close relatives. For dominance displays and in play, chimpanzees drum on trees and other resonant objects, while gorillas drum on their own chests (and occasionally other objects). Sue Savage-Rumbaugh’s bonobos have also demonstrated an appreciation of percussion and keyboard playing (recall that they also use keyboardlike machines for linguistic communication). Instrumental music is rare in vertebrates, except among the African apes, a group that includes us, leading Fitch to suggest that the drumming of chimpanzees and gorillas may be evolutionary homologs to human instrumental music.
Fitch has further explored the antecedents of human instrumentation via the divisive issue of Neanderthal flutes. A number of researchers have examined a fossilized cave-bear bone with two holes (and possibly another three damaged holes), attributed to Neanderthals.15 It has been argued that the object, which is radiocarbon-dated to approximately 43,000 years ago, is a flute. Although the provenance and nature of this bone are still regarded as controversial, Fitch points out that if it was a flute, it dates the origin of human instrumental music to at least the common ancestor of Neanderthals and humans, Homo heidelbergensis (see chapter 12), who lived more than 500,000 years ago.
No matter how the connection between language and music is parsed, what is apparent is that our sense of music, even our love for it, is as deeply rooted in our biology as language is. The upshot, said the University of Toronto’s Sandra Trehub, who also published a paper in the music issue of Nature Neuroscience, is that music may be “more like a necessity than the pleasure cocktail envisioned by Pinker.”
This is most obvious with babies, said Trehub, for whom music and speech are on a continuum. Mothers use musical speech, called motherese, to “regulate infants’ emotional states,” she explained.16 Regardless of what language they speak, the voice all mothers use with babies is something between speech and song. This kind of communication “puts the baby in a trance-like state, which may proceed to sleep or extended periods of rapture.” This means, explained Trehub, that music may be even more of a necessity than we realize.17
The First Word: The Search for the Origins of Language Page 20