How Language Began, by Daniel L. Everett

9

Where Grammar Came From

Speech is a non-instinctive, acquired, ‘cultural’ function.

Edward Sapir

SOMEONE MIGHT ASK this in English: ‘Yesterday, what did John give to Mary in the library?’ And someone else might answer, ‘Catcher in the Rye.’

This is a complete conversation. Not a particularly fecund one, but, still, typical of the exchanges that people depend on in their day-to-day lives, representative of the way that brains are supersized by culture as well as the role of language in expanding knowledge from the brain of a single individual to the shared knowledge of all the individuals of a society, of all the individuals alive, in fact. Even of all the individuals who have ever lived and written or been written about. It wasn’t the computer that ushered in the information age, it was language. The information age began nearly 2 million years ago. Homo sapiens have just tweaked it a bit.

Discourse and conversation are the apex of language. In what way, though, is this apical position revealed in the sentences in the discourse above? When native speakers of English hear the first sentence of our conversation in a natural context, they understand it. They are able to do this because they have learned to listen to all the parts of this complex whole and to use each part to help them understand what the speaker intends when they ask, ‘Yesterday, what did John give to Mary in the library?’

First, they understand the words, ‘library’, ‘in’, ‘did’, ‘John,’ and the rest. Second, all English speakers will hear the word with the greatest amplitude or loudness and also notice which words receive the highest and lowest pitches. The loudness and pitch can vary according to what the speaker is trying to communicate. They aren’t always the same, not even for the same sentence. Figure 25 shows one way to assign pitch and loudness to our sentence.*

Figure 25: Yesterday, what did John give to Mary in the library?

The line above the words shows the melody, the relative pitches, over the entire sentence. The italicised ‘yesterday’ indicates that it is the second-loudest word in the sentence. The small caps of ‘John’ mean that that is the loudest word in the sentence. ‘Yesterday’ and ‘John’ are selected by the speaker to indicate that they warrant the hearer’s particular attention. The melody indicates that this is a question, but it also picks out the four words ‘yesterday’, ‘John’, ‘Mary’ and ‘library’ for higher pitches, indicating that they represent different kinds of information needed to process the request.

The loudness and pitch of ‘yesterday’ say this – one is talking about what someone gave to someone yesterday, not today, not another day. This helps the hearer avoid confusion. The word ‘yesterday’ doesn’t communicate this information by itself. The word is aided by pitch and loudness, which highlight its specialness for both the information being communicated and the information being requested. ‘John’ is the loudest word because it is particularly important to the speaker for the hearer to tell them what ‘John’ did. Maybe Mary is a librarian. People give her books all day, every day. Then what is being asked is not about what Suzy gave her but only what John gave her. The pitch and loudness let the hearer know this so that they do not have to sort through all the people who gave Mary things yesterday. The word ‘John’ already makes this clear. The pitch and loudness highlight this. They offer additional cues to the hearer to guide their mental search for the right information.

Now, when someone asks this question, what does the asker look like? Probably they look something like this: the upper arms are held against their sides, with their forearms extended, palms facing outward and their brow furrowed. These body and hand expressions are important. The hearer uses these gestural, facial and other body cues to know immediately that the speaker is not making a statement. A question is being asked.

Now consider the words themselves. The word ‘yesterday’ appears at the far left of the sentence. In the sentence below the <>’s indicate the other places that ‘yesterday’ could appear in the sentence:

<> what <> did John <> give <> to Mary <> in the library <>?

So we could ask:

‘What did John give yesterday to Mary in the library?’

‘Yesterday what did John give to Mary in the library?’

‘What did John, yesterday, give to Mary in the library?’

‘What did John give to Mary yesterday in the library?’

‘What did John give to Mary in the library yesterday?’

‘What, yesterday, did John give to Mary in the library?’

But a native speaker is less likely to ask the question like this:

*‘What did yesterday John give to Mary in the library?’

*‘What did John give to yesterday Mary in the library?’

*‘What did John give to Mary in yesterday the library?’

*‘What did John give to Mary in the yesterday library?’

The asterisk in the examples just above means that one is perhaps less likely to hear these sentences other than in a very specific context, if at all. This exercise could be continued with other words or phrases of the sample sentence:

‘In the library, what did John give to Mary yesterday?’

This exercise on words and their orders need not be continued, however, since there is now enough information to know that putting together a sentence is not simply stringing words like beads on a cord.

Part of the organisation of the sentence is the grouping of words in most languages into phrases. These phrases should not be interrupted, which is why one cannot place yesterday after the word ‘in’ or after the word ‘the’. Grammatical phrases are forms of ‘chunking’ for short-term memory. They aid recall and interpretation.
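
The point that phrases resist interruption can be sketched in a few lines of Python. This is only an illustration: the grouping in `CHUNKS` copies the bracketing of the sample question shown earlier, and the function name `placements` is hypothetical. The adverb is inserted only at the boundaries between chunks, never inside one.

```python
# Chunks of the sample question, grouped as in the bracketed line above.
# The grouping is an illustrative assumption, not a full syntactic analysis.
CHUNKS = [
    ["what"],
    ["did", "John"],
    ["give"],
    ["to", "Mary"],
    ["in", "the", "library"],
]


def placements(chunks, adverb):
    """Return every sentence formed by inserting `adverb` at a chunk boundary."""
    sentences = []
    for i in range(len(chunks) + 1):
        before = [word for chunk in chunks[:i] for word in chunk]
        after = [word for chunk in chunks[i:] for word in chunk]
        sentences.append(" ".join(before + [adverb] + after))
    return sentences


if __name__ == "__main__":
    for sentence in placements(CHUNKS, "yesterday"):
        print(sentence)
```

Because insertion happens only between chunks, the starred orders such as ‘in yesterday the library’ are never produced, while the six acceptable positions are.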

Still, a great deal of cultural information is missing from the above example. For instance, what in the hell is a library? Is John a man or a woman? Is Mary a man or a woman? Which library is being talked about? What kinds of things are most likely to be given by John? Do John and Mary know each other? Although there are a lot of questions, one quickly perceives the answers to them if they are the speaker or hearer in the above conversation because of the knowledge people absorb – often without instruction – from their surrounding society and culture. People use individual knowledge (such as which library is the most likely candidate) and cultural knowledge (such as what a library is) to narrow down their ‘solution space’ as hearers. They are not, therefore, obligated to sort through all possible information to understand and respond, but must only mentally scroll through the culturally and individually most pertinent information that might fit the question at hand. The syntax, word selection, intonation and amplitude are all designed to help the hearer understand what was just said.

But there’s more, such as certain kinds of information in the example here. There is shared information, signalled sometimes by words such as ‘the’ in the phrase ‘the library’. Because someone says ‘the library’ and not ‘a library’, they signal to the hearer that this is a library that both know – they share this knowledge – because of the context in which they are having this conversation. The request here is for new information, as signalled by the question-word ‘what’. This is information that the speaker does not have but expects the hearer to have. Sentences exist to facilitate information exchange between speakers. Grammar is simply a tool to facilitate that.

The question is also an intentional act – an action intended to elicit a particular kind of action from the hearer. The action desired here is ‘give me the information I wish or tell me where to get it’. Actions vary. Thus, if a king were to say, ‘Off with his head,’ the action desired would be a decapitation, if he were being literal. Literalness brings us to another twist on how sentences are uttered and understood – is the speaker being literal or ironic or figurative? Are they insane?

With language, speakers can recognise promises, declarations, indirect requests, direct requests, denunciations, legal impact (‘I now pronounce you man and wife’) and other culturally significant information points. No theory of language should neglect to tell us about the complexity of language and how its parts fit together – intonation, gestures, grammar, lexical choice, type of intention and the rest. And what does the hearer do in this river of signals and information? Does she sit and ponder for hours before giving an answer? No, she understands all of this implicitly and instantly. These cues work together. Taken as one, they make the sentence easier to understand, not harder. And the evidence is that the single strongest force driving this instantaneous comprehension is information structure. What is new? What is shared? And this derives not merely from the literal meanings of the words but from that implicit cultural knowledge that I call dark matter.

Into the syntactic representation of a sentence, speakers intercalate gestures and intonation. They use these gestures as annotations to indicate the presence of implicit information from the culture and the personal experiences of the speaker and hearer. But something is always left out. The language never expresses everything. The culture fills in the details.

How did human languages go from simple symbols to this complex interaction between higher-level symbols, symbols within symbols, grammar, intonation, gesture and culture? And why does all of this vary so much from language to language and culture to culture? Using the same words in British English or Australian English or Indian English or American English will produce related but distinct presupposed knowledge, intonational patterns, gestures and facial expressions. Far from there being any ‘universal grammar’ of integrating the various aspects of a single utterance, each culture largely follows its own head.

There are universally shared aspects of language, of course. Every culture uses pitch, attaches gestures and orders words in some agreed-upon sequence. These are necessary limits and features of language because they reflect physical and mental limitations of the species. Maybe – and this is exciting to think about – some of this represents vestiges of the ways that Homo erectus talked. Maybe humans passed a lot of grammar down by example, from millennium to millennium as the species continued to evolve. It is possible that modern languages have maintained 2-million-year-old solutions to information transfer first invented by Homo erectus. This possibility cannot be dismissed.

Reviewing what we have learned about symbols, these are based on a simple principle – namely that an arbitrary form can represent a meaning. Each symbol also entails Peirce’s interpretant. Signs in all their forms are a first step towards another essential component of human language, the triple patterning of form and meaning via the addition of the interpretative aids of gestures and intonation. As symbols and the rest become more enculturated, they advance up the ladder from communication to language, to a distinction between the perspective of the outsider vs the perspective of the insider, what linguist Kenneth Pike referred to as the ‘etic’ (outsider viewpoint) and the ‘emic’ (insider viewpoint). Signs alone do not get us all the way to the etic and the emic. Culture is crucial all along the way.

The etic perspective is the perspective a tourist might have listening to a foreign language for the first time. ‘They talk too fast.’ ‘I don’t know how they understand one another with all those weird sounds.’ But as one learns to speak the language, the sounds become more familiar, the language doesn’t sound as though it is spoken all that rapidly after all, the language and its rules and pronunciation patterns become familiar. The learner has travelled from the etic perspective of the outsider to the emic perspective of the insider.

By associating meanings with forms to create symbols, the distinction between form and meaning is highlighted.† And because symbols are interpreted by members of a particular group, they lead to the insider interpretation vs the outsider interpretation. This is what makes languages understandable to native speakers, but difficult to learn for non-native speakers. The progression to language is just this: Indexes → Icons → (emic) Symbols + (emic) grammar, (emic) gestures and (emic) intonation.

Following symbols, another important invention for language is grammar. Structure is needed to make more complex utterances out of symbols. A set of organising principles is required. These enable us to form utterances efficiently and most in line with the cultural expectations of the hearers.

Grammars are organised in two ways at once – vertically, also known as paradigmatic organisation, and horizontally, referred to as syntagmatic organisation. These modes of organisation underlie all grammars, as was pointed out at the beginning of the twentieth century by Swiss linguist Ferdinand de Saussure. Both vertical and horizontal organisation of a grammar work together to facilitate communication by allowing more information to be packed into individual words and phrases of language than would be possible without them. These modes of organisation follow from the nature of symbols and the transmission of information.

If one has symbols and sounds then there is no huge mental leap required to put these in some linear order. Linguists call the placing together of meaningless sounds (‘phonemes’ is the name given to speech sounds) into meaningful words ‘duality of patterning’. For example the s, a and t of the word ‘sat’ are meaningless on their own. But assembled in this order, the word they form does have meaning. To form words, phonemes are taken from the list of a given language’s sounds and placed into ‘slots’ to form a word, as, again, in ‘sat’: s (slot 1), a (slot 2), t (slot 3).

And once this duality becomes conventionalised, agreed upon by the members of a culture, then it can be extended to combine meaningful items with meaningful items. From there, it is not a huge leap to use symbols for events and symbols for things together to make statements. Assume that one has a list of symbols. This is one aspect of the vertical or paradigmatic aspect of the grammar. Next there is an order to place these symbols in that a culture has agreed upon for the organisation of the symbols. So the task in forming a sentence or a phrase is to choose a symbol and place it in a slot, as is illustrated in Figure 26.

Figure 26: Extended duality of patterning – making a sentence

Knowing the grammar, which every speaker must, is just knowing the instructions for assembling the words into sentences. The simple grammar for this made-up language might just be: select one paradigmatic filler and place it in the appropriate syntagmatic slot.
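
The ‘select a filler, place it in a slot’ instruction can be written out as a minimal Python sketch. Everything named here is hypothetical: the slot labels, the word lists in `PARADIGMS` and the order in `SLOT_ORDER` are invented for illustration and are not taken from Figure 26.

```python
from itertools import product

# Vertical (paradigmatic) organisation: a list of possible fillers per slot.
# These slot names and words are invented for illustration only.
PARADIGMS = {
    "actor": ["John", "Mary"],
    "event": ["saw", "helped"],
    "undergoer": ["Bill", "Suzy"],
}

# Horizontal (syntagmatic) organisation: the agreed-upon order of the slots.
SLOT_ORDER = ["actor", "event", "undergoer"]


def build(choices):
    """Place one chosen filler in each slot, following SLOT_ORDER."""
    words = []
    for slot in SLOT_ORDER:
        filler = choices[slot]
        if filler not in PARADIGMS[slot]:
            raise ValueError(f"{filler!r} is not a filler for slot {slot!r}")
        words.append(filler)
    return " ".join(words)


def all_sentences():
    """Enumerate every sentence this toy grammar allows."""
    columns = (PARADIGMS[slot] for slot in SLOT_ORDER)
    return [" ".join(combo) for combo in product(*columns)]


if __name__ == "__main__":
    print(build({"actor": "John", "event": "saw", "undergoer": "Suzy"}))
    print(len(all_sentences()))
```

Even this toy grammar shows the economy of the arrangement: six stored words and one ordering rule yield eight distinct sentences.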

Once there is an inventory of symbols to be placed in a specific order, which is not a huge jump cognitively, early humans would already have been using the ideas of ‘slot’ and ‘filler’. These are the bases of all grammars.

All of this was first explained by linguist Charles Hockett in 1960.¹ He called the combination of meaningless elements to make meaningful ones ‘duality of patterning’. And once a people have symbols plus duality of patterning, then they extend duality to get the syntagmatic and paradigmatic organisation in the chart above. This almost gets us to human language. Only two other things are necessary – gestures and intonation. These together give the full language – symbols plus gestures and intonation. Here, though, the focus is on duality of patterning. As people organise their symbols they naturally begin next to analyse their symbols into smaller units. Thus a word such as ‘cat’, a symbol, is organised horizontally, or syntagmatically, as a syllable, c-a-t. But with this organisation it also becomes clear that ‘cat’ is organised vertically at the same time. So one could substitute a ‘p’ for the ‘c’ of cat to produce the word pat. Or one could substitute ‘d’ for ‘t’ and get instead cad. In other words, ‘cat’ has three slots, c-a-t, and fillers for each slot come from the speech sounds of English.
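
The substitution test on ‘cat’ can be sketched as follows. This is a rough illustration under stated assumptions: letters stand in for phonemes, the tiny `LEXICON` is a hypothetical word list, and the function name `neighbours` is invented here.

```python
import string

# A hypothetical mini-lexicon; letters stand in for speech sounds.
LEXICON = {"cat", "pat", "cad", "cot", "bat", "cap"}


def neighbours(word, lexicon=LEXICON, inventory=string.ascii_lowercase):
    """Return lexicon words obtained by swapping the filler of exactly one slot."""
    found = set()
    for slot, original in enumerate(word):
        for filler in inventory:
            if filler == original:
                continue
            candidate = word[:slot] + filler + word[slot + 1:]
            if candidate in lexicon:
                found.add(candidate)
    return sorted(found)


if __name__ == "__main__":
    print(neighbours("cat"))
```

Each slot of c-a-t is tried in turn, so the result includes ‘pat’ (first slot changed) and ‘cad’ (third slot changed), the two substitutions mentioned in the text.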

The syllable is therefore itself an important part of the development of duality of patterning. It is a natural organising constraint set on the arrangement of phonemes that works to enable each phoneme to be better perceived. It has other functions, but the crucial point is that it is primarily an aid to perception, arising from the matching of ears to vocal apparatus over the course of human evolution, rather than a prespecified mental category. A very simple characterisation of the syllable is that speech sounds are arranged in order. The order preferred most of the time is that, from left to right in the syllable, the sounds are arranged from least inherently loud to the loudest and then back to softest. This makes the sounds in each syllable easier to hear. It is another way of chunking that helps our brains to keep track of what is going on in language. This property is called sonority. In simple terms, a sound is more sonorous if it is louder. Consonants are less sonorous than vowels. And among consonants some (these need not worry us here) are less sonorous than others.‡ Thus syllables are units of speech in which the individual slots produce a crescendo-decrescendo effect, where the nucleus or central part is the most sonorous element – usually a vowel – while at the margins are the least sonorous elements. This is shown by the syllable bad. This is an acceptable syllable in English because b and d are less sonorous than a and are found in the margins of the syllable while a – the most sonorous element – is in the nuclear or central position. The syllable bda, on the other hand, would be ill-formed in English because a less sonorous sound, b, is followed by another less sonorous consonant sound, d, rather than immediately by an increase in sonority. This makes it hard to hear, or hard to distinguish b and d when they are placed together in the margin of the syllable.
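
The crescendo-decrescendo pattern can be checked mechanically. The sketch below is only an approximation: the numeric scale in `SONORITY` is an assumed, simplified ranking (stops lowest, vowels highest) that the text does not spell out, letters stand in for speech sounds, and the function name is hypothetical.

```python
# Assumed, simplified sonority ranks; the text only states that consonants
# are less sonorous than vowels and that consonants differ among themselves.
SONORITY = {
    **{c: 1 for c in "pbtdkg"},  # stops
    **{c: 2 for c in "fvsz"},    # fricatives
    **{c: 3 for c in "mn"},      # nasals
    **{c: 4 for c in "lr"},      # liquids
    **{c: 5 for c in "aeiou"},   # vowels
}


def follows_sonority_sequencing(syllable):
    """True if sonority strictly rises to a single peak and then strictly falls.

    Raises KeyError for any letter missing from the SONORITY table.
    """
    levels = [SONORITY[ch] for ch in syllable]
    peak = levels.index(max(levels))
    rising = all(a < b for a, b in zip(levels[:peak], levels[1:peak + 1]))
    falling = all(a > b for a, b in zip(levels[peak:], levels[peak + 1:]))
    return rising and falling


if __name__ == "__main__":
    for syllable in ("bad", "bda"):
        print(syllable, follows_sonority_sequencing(syllable))
```

Under these assumed ranks, ‘bad’ passes because sonority rises from b to a and falls again to d, while ‘bda’ fails because there is no rise from b to d.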

Languages vary tremendously in syllabic organisation.§ Certain ones, like English, have very complicated syllable patterns. The word ‘strength’ has more than one consonant in each margin. And the consonant ‘s’ should follow the consonant ‘t’ at the beginning of ‘strength’ because it is more sonorous. So the word should actually be ‘tsrength’. It does not take this form because English has historically preferred the order ‘st’, based on sound patterns of earlier stages of English and the languages that influenced it, as well as cultural choice. History and culture are common factors that override and violate the otherwise purely phonetic organisation of syllables.
