The symptoms of ASD are consistent with the idea that sufferers of the disorder are unable to incorporate other people’s prioritisation or ‘ranking’ of values, perspectives, intentions, social roles and knowledge structures into the construction of their psyche. Thus they struggle to grasp the social implications of their apperceptions. But they show even more abnormalities, including certain bodily experiences. One possibility is that some of the physical aversions and preferences feed similar experiences in their minds. And ASD sufferers intensely dislike intrusions into their mental lives. Things such as sounds, touching, play, disorder that they cannot fix and phatic language distress sufferers. But, of course, these are crucial to constructing the articulated unconscious that is vital for one’s appreciation for and place in the culture of our community.
Phatic language difficulties are particularly revealing, and include things like leave-takings (‘Goodbye’, ‘Adios’, ‘Hasta la vista, baby’ and ‘Bye-bye’), greetings (‘What’s up!’, ‘Dude, what is it?’, ‘Hello!’), gratitude expressions (‘Thanks’, ‘Oh wow, I can never repay you!’) and so on. Phatic language has long been analysed by linguists and anthropologists as a form of ‘grooming’, that is, a recognition of the other as a person you value, however superficially. Interestingly, although English – which has phatic language – lacks grooming as a regular cultural ritual across most of its communities, Pirahã, an Amazonian language, has daily grooming, in which men and women sit in lines and look for lice in each other’s hair (or simply stroke each other’s hair), yet it lacks phatic vocabulary. In other words, perhaps all cultures develop mechanisms, linguistic or otherwise, to demonstrate belonging and care to one another as members of the same community. ASD sufferers often seem to lack this capacity for the mutual ‘grooming’ behaviours that establish that two or more individuals ‘belong’ together and accept one another.
Put in other ways, the ASD sufferer is often unable to build a theory either of their enveloping culture or of the intentions of those they interact with. Equally significantly, they are unable to develop a full range of cultural knowledge, values and appreciation of social roles in order to construct a social identity.
The isolation and dysfunctionality of ASD sufferers might be argued to arise from, in part, their inability to converse normally with others, itself resulting from a social inability to feel ‘mutual belonging’ and share ‘mutual ideas’. To converse normally requires and helps to construct a reasonable command of the grammar of the language in which the conversation is taking place, an understanding of the context of the conversation, the purpose of the conversation, the mental state of the interlocutor(s), cultural background knowledge and general world knowledge. These different abilities and forms of knowledge all boil down to recognising culture and working to belong to one. These seem to be the major ethnolinguistic difficulties faced by people with ASD.
Not only is conversation the apex of language, but engaging in conversations has been shown by many researchers to build cultural connections and knowledge while actually constructing the very grammar that is necessary for the conversation. In other words, as we have seen many times, language is synergistic, two or more components of or applications of language mutually enabling one another. In this sense, the very existence of ASD underscores the importance of the language–culture connection.
This discussion of the failure of sufferers of autism to construct either a cultural role or robust cultural understanding raises another proposal that has become popular among some in the discussion of language evolution. This is what is referred to as ‘niche construction’ theory and is the idea that humans fuse part of the environment, say infancy and conversation, to create particular bio-cultural and psychological niches that enrich cognitive and linguistic development, such that people are able to construct larger and larger niches. Perhaps, one might reason, niche construction is the key to ASD – the child fails to construct the proper ‘niche’. In a sense this is correct. But another special theory is unnecessary. ASD, so far as it relates to the connection between culture and the mind, is fully accounted for by what has already been proposed, without need for further theories.
Niche theory is an elaborate mechanism to explain the development of the child and the species by focusing on the construction of relationships of individuals via conversational interaction. There is much to commend the work on niche construction. Yet at the same time there are three problems that lead me to believe that it isn’t very helpful either to the modelling of individual human language development or maturation or to the understanding of human language evolution. First, much of niche construction is already accounted for by what psychologists refer to as ‘attachment theory’, which aims to determine if there are principles that govern the growth of the relationship between infants and caregivers and, if so, what those principles are and whether they are the same across cultures. (It appears that they might not be.)
The second reason one might not take niche theory to be an integral component of the story of language is that it is largely a metaphor for fairly well-understood phases of developmental psychology. And, finally, this is a theory that makes the wrong predictions about human language evolution, concluding that human language appeared only about 100,000 years ago, a conclusion claimed here to be incorrect.
This matter arises in various aspects of the invention of language, such as in the fact that language is ‘just good enough’. Far from being a perfect biological system of any sort, language just gets by, often failing to communicate as well as one might imagine, its hearers employing general facts of the environmental context and world knowledge to interpret what is being said. And this echoes the strategies of aphasics and Homo erectus. Context and general knowledge are crucial for working out the meaning of what one hears and for understanding how to respond in the course of the conversation. For ASD sufferers, however, language is not even ‘good enough’. It is deficient, because for these people their language is unable to link up with the social knowledge crucial to the major function of language, namely communication.
A startling conclusion emerges from deficits affecting language: There are no language-only hereditary disorders. And the reason for that is predicted by the theory of language evolution here – namely that there could not be such a deficit because there is no language-specific part of the brain. Language is an invention. The brain is no more specialised for language than for toolmaking, though over time both have affected the development of the brain in general ways that make it more supportive of these tasks.
Language disorders are a window into the human brain and its preparedness for language. But language is more than the brain. It is a function of the entire body, including those components from the lungs to the mouth that make oral speech possible. Although we know that language and speech are distinct, and that there are various kinds of ‘speech’ – visual speech, sign languages and vocal speech – the primary form of speech in all the world’s languages is oral. So the question we must answer, now that we understand the brain a bit better, is how evolution prepared us to vocalise our languages.
* One of the best descriptions of the disaster of aphasia is found in The Man Who Lost His Language, an outstanding book by Sheila Hale, about her husband John Hale’s stroke. John was a famous historian, knighted for his contributions to British scholarship, Chair of the Trustees of the National Gallery and author of brilliant books, including The Civilization of Europe in the Renaissance. John’s stroke-induced aphasia in effect transformed him back into the languageless infant he was at the beginning of his life. The tragic story (and indictment of the British National Health Service) is nevertheless a brilliant, touching, insightful account of this terrible deficit.
† This is reminiscent of what linguist Derek Bickerton refers to as ‘protolanguage’. As I have remarked many times, however, I reject Bickerton’s term here because it suggests (at least to me) that erectus’s language was known not to be a fully human language, whereas I consider erectus’s language fully developed, not merely a precursor to modern language.
‡ Paraphrased from the National Institutes of Health website www.nichd.nih.gov/health/topics/autism/conditioninfo/Pages/symptoms.aspx
8
Talking with Tongues
… if the play makes the public aware that there are such people as phoneticians, and that they are among the most important people in England at present, it will serve its turn.
George Bernard Shaw, Preface to Pygmalion
In 1964 my eighth-grade school marching band won a local competition in the Imperial Valley of Southern California. I played baritone horn and was an enthusiastic member. And I knew that, by winning this contest, we would now be able to go to the higher regional competition in Los Angeles, about 130 miles (210 kilometres) to the northwest of our little town of Holtville, near the Mexican border.
Our director wanted to expose the band to some higher culture while in the LA area, so he petitioned the school board to allow us to attend a showing of Mozart’s opera Don Giovanni. The school board said no. Too risqué for junior high schoolers. Instead, we were allowed to attend the band director’s second choice, a showing at the Egyptian Theater in Hollywood of My Fair Lady, starring Rex Harrison and Audrey Hepburn. The band instructor prepared us by talking about the play by George Bernard Shaw, the source for the movie.
This film eventually played a role in my decision to become a linguist, as it revolved around the transformative power of human speech, told from the perspective of Henry Higgins and his reluctant pupil Eliza Doolittle. What is this thing called speech that all humans possess, that George Bernard Shaw believed to be the key to success in life? In The Kingdom of Speech Tom Wolfe claims that speech is the most important invention in the history of the world. It not only enables us to speak to one another but also immediately classifies us by economic class, age group and educational attainment. If erectus were around today, would people consider them brutish because of the way they spoke, even if one could so dress them as to pass them off as a peculiar-looking modern human?
Although communication is ancient, human speech is evolutionarily recent. Cognitive scientist and phonetician Philip Lieberman claims that the speech apparatus of modern Homo sapiens is only about 50,000 years old, so recent that even earlier Homo sapiens could not speak as we do today.1 This is not to be confused, though, with the 50,000-year date proposed by other authors for the appearance of language. Speech followed language. Therefore, Lieberman’s 50,000-year date would be, if correct, evidence against the idea that language proper appeared suddenly 50,000 years ago. If erectus did indeed invent symbols and begin humanity’s upward trek through the progression of signs to language, enhanced speech would have come later. It is expected that the first languages would have been inferior to our present languages. No invention begins at the top. All human inventions get better over time. And yet this does not mean that erectus spoke a subhuman language. What it does mean, however, is that they lacked fully modern speech, for physiological reasons, and that their information flow was slower – they didn’t have as much to talk about as we do today, nor do they seem to have had sufficient brain power to process and produce information as quickly as modern sapiens. Erectus’s physiological shortcoming was overcome by gradual biological evolution. The development of information processing and enhanced grammatical ability resulted from cultural evolution. Both biological and cultural evolution over 60,000 subsequent generations of humans improved our linguistic abilities dramatically.
In a 2016 paper in the publication of the American Association for the Advancement of Science, Tecumseh Fitch and his colleagues argue in effect that Lieberman is mistaken in his view of the evolution of the human vocal tract. They claim instead that the vocal apparatus is much older than the fifty millennia proposed by Lieberman – so old, in fact, that it is found in macaque monkeys.2 While the study by Fitch and colleagues is intriguing, there are three reasons why it is not particularly useful in understanding language evolution. First, most of the extreme tongue positions that they claim to be similar between macaques and humans come from macaques’ yawning. The assumption of Fitch and his co-authors seems to be that, if they can get macaques to place their tongues in the right positions to produce certain human vowels while yawning, the macaques could repeat this if they were speaking. This is a doubtful assumption, though, because yawning is not nearly so effortless a gesture as the production of a back vowel (which resembles yawning in the shape of the tongue) is for humans. The tongue is retracted in effortful ways, and it is doubtful speech sounds would evolve out of a yawning vocal tract shape. Second, the authors compare the phonetics of the macaque against archival human phonetics. They should instead have retested human phonetic properties using the very same methods they used on macaques, in order to compare them more equally.3 Finally, and most importantly, language does not require speech as we know it. Languages can be whistled, hummed, or spoken with a single vowel with or without a consonant. It is the confluence of culture and the Homo brain that gives us language. Our modern speech is a nice, functional add-on.
On the surface of it, human speech is simple. Vowels and consonants are created along the same principles as the notes formed by wind blowing through a clarinet. The root of both is basic physics. Air flows up from the lungs and out the mouth and is modified as it passes through either the tube of the clarinet or the tube of the human vocal apparatus. In the case of the clarinet the airflow is transformed by keys and a reed that alter it so that it can make the sublime sounds Benny Goodman produced, or the squeaking and squawking of a beginner. In the case of speech, the flow of air is transformed by the larynx, the tongue, the teeth and the different shapes and movements of all the stuff in our throats, noses and mouths that lies above the larynx.
But speech is more complex than a mere wind-tube effect. That is because the tube of human speech is controlled by a complex respiratory physiology directed by an even more complex brain. The creation of speech requires precise control of more than one hundred muscles of the larynx, the respiratory muscles, the diaphragm and the muscles between our ribs – our ‘intercostal’ muscles – and muscles of our mouth and face – our orofacial muscles. The muscle movements required of all these parts during speech are mind-bogglingly complex. The ability to make these movements required that evolution change the structures of the brain and the physiology of the human respiratory apparatus. On the other hand, none of these subsequent adaptations was required for language. They all simply made speech the highly efficient form of transmitting language that we know it as today. Still, it is unlikely that any erectus woman could have played Eliza. Her appearance would never have fooled anyone.
There are three basic parts to human speech capabilities that evolution needed to provide us with to enable us to talk and sing as humans do today. These are the lower respiratory tract, which includes the lungs, heart, diaphragm and intercostal muscles; the upper respiratory tract, which includes the larynx, the pharynx, the nasopharynx, the oropharynx, the tongue, the roof of the mouth, the palate, the lips and the teeth; and, most importantly by far, the brain.
The average human produces 135–185 words per minute. Two things about this are deeply impressive. First, it is amazing that humans can talk that fast and consider it normal. Second, it is nearly incredible that people can understand anyone speaking that fast. But, of course, humans do both of these, producing and perceiving speech, without the slightest effort when they are healthy. These are the two sides to speech – production (speaking or signing) and perception (hearing with understanding). To grasp how speech production and perception evolved one has to know not only how the upper and lower respiratory tracts evolved, but also how the brain is able to control the physical components of speech so well and so quickly.
To tell the story of speech, we need to look at the vocal apparatus and the evidence for speech capabilities across the various species of Homo. It is important to have a clear idea of how sounds are made, how sounds are perceived and how the brain is able to manage all of this. But prior to that, it is essential to comprehend the state of human speech today. How does speech work with modern Homo sapiens? Knowing the answer to such questions can make it possible to judge how effective the speech of other Homo species would have been relative to sapiens’, and if they were, in fact, capable of speech.
Speech comes out of mouths, travels through the air and enters the ears of hearers, to be interpreted by their brains. Each of the three steps in the creation and transmission and understanding of speech has an entire subfield of phonetics, the science of sounds, dedicated to it. The creation of sounds is the domain of the field of ‘articulatory phonetics’. The transmission of sounds through the air is ‘acoustic phonetics’. And the hearing and interpretation of sounds is ‘auditory phonetics’. But one also encounters names of subfields that are concerned with other types of function. There are studies of the physics and mechanics of speech perception and speech production. These different studies are often grouped together under the name ‘experimental phonetics’. It isn’t necessary to understand all of these to understand the evolution of speech but a wee bit of understanding of them would be helpful.
The larynx is vital to the understanding of the language of Homo species, as it enables humans not only to pronounce human speech sounds, but also to have intonation and use pitch to indicate what aspect of an utterance is new, what is old, what is particularly important, whether people are asking a question, or making a statement. The larynx is where the airflow from the lungs is manipulated in order to produce phonation, the confluence of energy, muscles and airflow required to produce the sounds of human speech.
How Language Began Page 20