by John Simpson
And how could we police that? Twenty thousand pages of type, each with three columns of tiny print. And it was difficult text. It wasn’t Jane Austen. There were swathes of quotations dating all the way back from the Old English period up through Chaucer, Shakespeare, and all stations to the present day: it was not the sort of text with which the normal keyboarder in Florida would be familiar. And once the text was keyed, we then had to proofread the whole thing—which was a major project in its own right. Managing the keying and proofreading operation in Oxford became the responsibility of my colleague Yvonne, who set up a band of fifty freelance proofreaders to do the job. They could be seen shuffling into the office every now and then to collect their work and to hand back what they had just completed. And all this needed to run precisely to the clock, given the tight deadline we were facing.
In the mid- to late eighteenth century, the verb to transpire caused no end of arguments between otherwise healthy individuals. There are people who think that words should mean what they used to mean, and that any deviation from this is heresy (the “etymological fallacy”: that nice is still somehow related to its origin in Latin nescius, “ignorant”; or that logic is only argument over “words,” from the Greek logos, a word). Transpire (or at least its aged equivalent) has a literal meaning in classical Latin, but over the centuries English speakers have, to use the technical word, mangled this. The word transpire is known in English from the latter end of the sixteenth century, and it derived from Latin transpirare: trans- as in “across” (Trans-Siberian, etc.) and spirare as in “to breathe” (inspiration, spirit, etc.). So you would expect it to mean something to do with transmission by breathing.
Here’s how the OED views that old meaning: “To emit or cause to pass in the state of vapour through the walls or surface of a body; esp. to give off or discharge (waste matter, etc.) from the body through the skin.” As we move through the seventeenth century the range of contexts in which the term could be employed grows, but the core meaning remains constant: perspiration comes into it rather frequently—liquid passing from inside to outside or from outside to in. It turned out to be quite a useful word in the emerging sciences of the Early Modern period, and was heading for stardom.
The first hiccup on the road to immortality occurred in 1748. It concerns Lord Chesterfield, who—as a style leader—later annoyed the volatile Dr Johnson by withdrawing support from his proposed dictionary when the noble lord realised that it was veering off plan (i.e., Johnson had accepted that language changes and wasn’t necessarily inching nearer and nearer to perfection, and that dictionaries should be open to this). In 1748 Lord Chesterfield (rather ironically and despite his general qualms on language change) decided to use the verb transpire in a figurative way that was new to English (though not to French transpirer), when writing to one of his correspondents. This is what he wrote: “This letter goes to you, in that confidence, which I . . . place in you. And you will therefore not let one word of it transpire.”
Now there is absolutely nothing wrong with that (should you be the sort of person who finds things “wrong” with language), and the French had developed this use slightly earlier in the eighteenth century. What Lord Chesterfield was saying was that he did not want one word of the contents of his letter to permeate from its current private, secret state through to public view. The development from the physical transpiration to the metaphorical one is easy and unexceptionable. But when he came to address the word in his dictionary, Dr Johnson rather pompously found even this minor semantic shift too much (“a sense lately innovated from France without necessity”).
What happened next, though, set language purists into a deep decline. According to the OED, it was an American lady—Abigail Adams, wife of the second president of the United States—who is credited with writing in 1775 to her husband at the Continental Congress in Philadelphia: “There is nothing new transpired since I wrote you last.” I’m sure others used it in this way before she did, but at the moment she has all the credit.
Language purists hated this new meaning “to occur, to happen.” What, no permeation—or at least only permeation and transmigration of the very loosest variety, in the sense of something moving from one state (nothing happening) to another state (something happening). Organic change like this should not happen in a polite eighteenth-century salon. The fact that it might well be an Americanism—and Americans were not really top of anyone’s dance card in Britain around 1775/1776—probably made the usage even less popular in Britain than it might have been. The First Edition of the OED despairs: it is a “misuse.” The dictionary offers some assistance: “Evidently arising from misunderstanding such a sentence as ‘What had transpired during his absence he did not know,’” which is itself a rather enigmatic and confusing way of explaining something. In observing the usage, the OED was following in the footsteps of earlier lexicographers, such as the American Joseph Worcester, who wrote in 1850: “This novel use of the word is pretty common in the United States; nor does it appear to be very uncommon in England, though it has been repeatedly censured by judicious critics, both here and there, as improper.” Worcester didn’t seem to mind too much, though. It transpires.
In the autumn of 1985, I was invited—along with Hilary and three-and-a-half-year-old Kate—to Waterloo, Canada, on the notion that—as an OED expert—I might advise the Waterloo computer gurus on their job of thinking theoretically about possible structures for the planned OED database. Actually, my role was simpler than that. While the gurus were discussing how best to organise the dictionary’s mass of data inside a computer, I was expected to supply in-depth information on the structure and content of the print version of the dictionary, in case by any chance it had implications for their work.
At the same time I was to be addressed as visiting assistant professor in the Waterloo English Department, and was expected to hold postgrad classes on the history and practice of lexicography. Hilary was also invited to teach on her speciality: freshman 101 with particular reference to the modern English novel. Kate was expected to go to day care: she remembers clearing snow from the drive from her time in Canada, but she doesn’t remember the educational benefits of day care.
I love the conventional eighteenth-century spelling gooroo. Guru is another of these Early Modern English words (so it gets an extra tick from me). Like juggernaut, it is one of many Indian words that early travellers and traders encountered in the Indian Subcontinent. The austerity of the gurus’ lifestyle, and the deep respect in which they were held, struck the Europeans as surprising—the wise men that they knew didn’t look quite like the Indian guru. So naturally they wanted to tell their audience back home about this new phenomenon. The OED defines guru primarily as “a Hindu spiritual teacher or head of a religious sect,” from 1613 (the same year as the first “computer”). Samuel Purchas, an indefatigable recorder of travellers’ tales at the time, refers to “a famous Prophet of the Ethnikes, named Goru.” The dictionary sensibly thinks that this is a common noun, not the gentleman’s personal name.
Normally Indian gurus seem to be slight, thin gentlemen. This is not always the case with computer gurus, where too much sitting down at the keyboard and being sedentary broadens the beam. In twentieth-century English we started to extend the meaning of guru from its original Hindi sense. H. G. Wells writes: “I ask you, Stella, as your teacher, as your Guru, so to speak, not to say a word more about it,” in his Babes in the Darkling Wood (1940). After that, the word was available for use generally as “any influential teacher or mentor.” Computer administrators at the time rather liked to be called “gurus”—not to their face, necessarily, but they liked to know it was going on out of their earshot.
These were exciting days for the digitisation of the OED. The dictionary had previously often been the subject of informed comment and criticism from literary scholars, but in Waterloo the way that the dictionary held and presented its information was rigorously probed by computational logicians. They asked me simple but awkward questions about the dictionary that I hadn’t necessarily heard before. Like: “What is the generic structure of a definition?” or “Would it be useful to link from an entry to full, digitised versions of the texts it references?” “Would you always want to present entries in alphabetical order, or should there be other options?” “Would you ever want a computerised assessment of which entries are most out of date?” “Do you want us to tell you computationally which twentieth-century compounds seem common enough in real text to be added to the dictionary?” I knew that my answers would help shape the OED of the future. It was a good match, and a very productive one.
Waterloo had already devised some lightning-fast search software for the dictionary. (This was years before software of this kind was available off the shelf.) Although the software searched almost instantaneously over the massive amounts of text residing in the dictionary, the dictionary’s text was complicated enough that a human brain (in this case mine) was needed to unscramble problematic issues.
The problems we tussled with were esoteric, but they had to be resolved if we wanted to be able to search the digitised database more efficiently and discover new information about the language that was just lying hidden at the moment, waiting just below the surface. Take dates, for example. Traditionally, if the OED couldn’t allocate a precise date to one of the texts it was citing in support of a word (often an undated manuscript text deriving from before the invention of printing), then it would date it imprecisely—say, circa 1400 (“around about the year 1400”). That caused no problem to human readers consulting an entry. They used their common sense. But that was no use to a computer, because computers weren’t born with common sense. The computer didn’t instinctively know what circa implied, and it wanted a real number or number range to work with. So we had to come up with one. After experimentation, we devised an idiosyncratic system which told the computer that for one period in early history (when you generally had only a very sketchy idea of when the manuscript was actually composed) circa should be assumed to imply a date range of fifty years, but that for post-medieval texts (when you might generally be more confident of your ability to date the text) circa might suggest a twenty-five-year spread, or later in time just five years. It wasn’t perfect, but with that information nestled in its software, the computer could work happily with the dictionary’s apparent inexactitude.
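For readers who like to see such things spelled out, the circa rule can be sketched in a few lines of Python. This is only an illustration, not the Waterloo software. The three spreads (fifty, twenty-five and five years) come from the account above, but the cut-off years between periods, the choice to centre the range on the stated year, and every name in the code are assumptions made for the example.

```python
from typing import NamedTuple


class DateRange(NamedTuple):
    earliest: int
    latest: int


# Hypothetical cut-off years. The account gives the three spreads but does
# not say where one period ends and the next begins.
MEDIEVAL_END = 1500      # assumption: before this, a fifty-year range
EARLY_MODERN_END = 1800  # assumption: before this, a twenty-five-year range


def circa_spread(year: int) -> int:
    """Return the total width in years assumed for 'circa <year>'."""
    if year < MEDIEVAL_END:
        return 50
    if year < EARLY_MODERN_END:
        return 25
    return 5


def circa_to_range(year: int) -> DateRange:
    """Turn 'circa <year>' into an explicit range of years.

    Assumption: the range is centred on the stated year. When the spread
    is odd, the extra year is placed on the later side.
    """
    spread = circa_spread(year)
    before = spread // 2
    after = spread - before
    return DateRange(year - before, year + after)


if __name__ == "__main__":
    for year in (1400, 1650, 1900):
        print(f"circa {year} ->", circa_to_range(year))
```

With a range in place of a vague label, a search for, say, quotations that might date from the 1380s can simply test whether the decade overlaps the range, which is the kind of question a computer can answer without common sense.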
I was introduced in Waterloo to a culture in which texts were the object of a new form of analysis: where looking for patterns in language was paramount. We began to think about how computers could be taught to extract information from texts so as to improve dictionary content and to benefit language analysis generally. Obviously we weren’t alone in this, but it was something of a revelation for me, coming from a long-standing editorial project in Oxford that was really a throwback to the 1950s, when the Supplement to the OED had originally been established.
All of a sudden we were familiarising ourselves with a host of new ways by which we could investigate the dictionary. Shakespeare is always a good example. We looked at all the words and meanings said to have been coined by Shakespeare (around 8,000 in all): What could that tell us about authorial creativity (was it greatest in nouns, adjectives, verbs?)—Was Shakespeare’s lexical creativity most active in his early work or later, in the comedies or the tragedies (we found 210 neologisms in A Midsummer Night’s Dream, but 480 in Hamlet)? We revealed that Shakespeare was credited with augmenting the word-stock of English with meanings for 2,900 nouns (such as partner = dancing partner), 2,350 adjectives (Nestor-like), 2,250 verbs (to waddle, applied to a person), 146 phrases (too much of a good thing), 40 interjections (bow-wow!), 39 prepositions, etc. We discovered how many English words had been coined between 1600 and 1610 (8,400, at least according to the OED’s record). We looked at the aggregate number of nouns, verbs, and other parts of speech before this date and after: Was there a difference, and, if so, what did that tell us about the language? Did we lose many words in the fifteenth century through obsolescence? If so, what sort of words were these? Why did we lose them? What was happening to our society to cause this to happen? And how do those words compare with the ones we gained over the same period? Sometimes the data wasn’t really strong enough at the time to answer our questions, but it was exhilarating just to think of the questions, let alone discover the answers.
It was in discussions of this sort that I spent lazy afternoons with the computer scientists at Waterloo. There were two gentlemen who led the pack: Frank Tompa and Gaston Gonnet. Frank knew considerably more mathematics and computer science than was good for him, and Gaston knew the rest. Frank was so good at it that he’s had a street in Waterloo named after him, and Gaston had already invented his own computing language, which he named, with the true generosity of a Uruguayan to his adopted country, Maple. The professors were as excited about the project as the lexicographers were. We thought of it in the same breath as the Human Genome Project, getting off the ground at around the same time: pattern-matching through huge swathes of data was crucial to both adventures.
One of the most unexpected things about going to Waterloo was that suddenly I was in an environment where lexicography was regarded as an important new area of progressive language research led by a computerised OED. The computer science professors in the University of Waterloo’s new Centre for the OED were helping to wrestle the OED towards a brave new future.
My English Department master’s students were also curious about and excited by this new field of research, and not fixated on the fusty British stereotype of lexicographers. In England, the media often couldn’t see beyond the sepia images of OED lexicographers as old men with trailing white beards, amnesia, and chalk dust. My postgrad class investigated the history of antique English lexicographers, from Robert Cawdrey in the early seventeenth century, through Dr Johnson, to Noah Webster (not a beard among them), but they also had access on campus to the early reaches of the OED on computer. They soon came to realise that examining this database could give their literary and language studies a new perspective.
In those days, there were only a few dictionary courses available in universities and colleges. Before the digital revolution, historical dictionaries were something of a cosy end in themselves, admired by literary scholars but often overlooked by others whose interest in language would be gripped by the new technology. We needed to look at the tradition of dictionaries before we broadened out into options for the future.
Dictionary history can be taken all the way back to Akkadian/Sumerian wordlists in the third millennium BC. But to do so is not particularly enlightening for the student of the English dictionary, so we started our course at a more recent date.
There are several basics that you need to know. Bilingual glosses, and then bilingual dictionaries, predate monolingual dictionaries such as the OED, because people encountered a need to understand foreign languages before they worried unduly about the definition of words in their own language. The first language that the early English needed to gloss was Latin, the language of the early church and of instruction. In order to assist their pupils’ understanding of the Bible, Anglo-Saxon teachers would write the English equivalents of Latin words (or sometimes just simpler Latin words) between the lines of their manuscript texts. These glossaries are very rare, but we know of several works of this nature. The surviving Leiden Glossary, for example, was copied in modern-day Switzerland around 800 AD from a lost Anglo-Saxon manuscript.
But there is little doubt that, despite indicating a need for reference texts to help understand language, the interlinear glosses of medieval manuscripts were never going to be international successes. We had to await the invention of printing, in the mid- to late fifteenth century, before a medium was available through which dictionary (and other) texts could be made more readily accessible.
Even then, the British were not ready for their first English dictionary. The medievals had been more interested in classifying knowledge in thesauruses (literally “treasure houses” of knowledge) than in dictionaries. But the need for translation remained strong. In the sixteenth century, bilingual dictionaries (involving Latin, English, French, Italian, or Spanish) fed a market in Europe (including Britain) eager to make sense of the changing world: important titles at this time include the Latin-English Dictionary of Sir Thomas Eliot (1538), which was published in many editions throughout the sixteenth century, as well as Claudius Hollyband’s Dictionarie French and English (1593) and John Florio’s Italian-English Worlde of Wordes (1598).
We first meet the word dictionary in the thirteenth century, in its Latin form dictionarium. But at this stage it wasn’t used in the way we know it today. It was introduced by the Parisian teacher John of Garland (born in England, it should be noted) as the name for his elementary guidebook to Latin composition. The OED and others have investigated this matter in some depth, and report that the author’s introduction states (in translation) that the book “is called Dictionarius, not from [Latin] dictio in the sense of ‘single word’ but from dictio in the sense ‘connected speech.’” So John of Garland intended it to mean a book explaining the mysteries of “connected” Latin composition, not a “word-list” with glosses or definitions.