But even as this system of typographical tone of voice is developing so beautifully, it’s also under threat. When asking about the future of technolinguistic tools, like speech to text or predictive smart replies, we need to ask not just how they can be used, but how they can be subverted; not just how designers can help users communicate their intentions, but how users can help them communicate more than the designers intended. It’s all very well to be sincere when asking a voice assistant for the day’s weather, but for technology that aims to help us write messages to other people, the next great challenge is not just the words we say but how we say them. It was the subversion of autocapitalization, after all, that paved the way for ironic minimalism, and the subversion of traditional handwritten means of calling attention that paved the way for #E m p h a s i s™. For typographical tone of voice, training on formal datasets from books and newspapers is not going to be enough. This kind of subtlety must be part of the future of any system that aims to facilitate writing, and it’s not yet clear how to do so effectively: IBM experimented with adding Urban Dictionary data to its artificial intelligence system Watson, only to scrub it all out again when the computer started swearing at them.
It’s important, in the meantime, not to overstate what’s changed. Many features of internet tone of voice have been around for over three decades, if we take those 1984 Usenet posts as a starting point, and yet if E. E. Cummings or L. M. Montgomery were to pick up a modern book or read a modern newspaper, they’d see edited prose that looks quite familiar to their 1920s eyes. In formal writing, periods are still emotionally neutral,* questions and question marks still march hand in hand, uppercase still demarcates sentence beginnings and proper names, and one must still rely on clever phrasing to communicate sarcasm. (Alas ؟)
It’s not that writing has completely changed, it’s that writing has forked, into formal and informal versions. But this forking didn’t coincide with the invention of the internet, or even of the computer. All caps, expressive lengtheninggg, ~irony punctuation~, minimalist punctuation, and capitalization paired with linebreaks all have direct ancestors in the early twentieth century, not the twenty-first. Think of the minimalist punctuation and capitalization of E. E. Cummings or the stream of consciousness used by James Joyce in the last chapter of Ulysses, which is 4,391 words long and punctuated only by two periods. The principle of the stream-of-consciousness writing style was that it represented the flow of thoughts in one’s head better than rigidly conventionalized formal writing, so if we’re looking to make our writing closer to our thoughts, perhaps it’s not surprising that we’d end up sounding modernist or postmodern.
We could even trace this fork back to the beginning of grammatical typography. When grammarians decided that the scribal, pause-based punctuation needed to be reformed under the model of Latin grammar, they may have been able to change the practices of schoolteachers and editors, but they never wholly held dominion over private letters, handwritten signs, or notes left on the kitchen table. In the future, the era of writing between the invention of the printing press and the internet may come to be seen as an anomaly—an era when there arose a significant gap between how easy it was to be a writer versus a reader. An era when we collectively stopped paying attention to the informal, unedited side of writing and let typography become static and disembodied.
The internet didn’t create informal writing, but it did make it more common, changing some of our previously spoken interactions into near-real-time text exchanges. At the same time, keyboards took away some of our previous repertoire for expressive writing, like multiple underlines, colored ink, fancy borders, silly doodles, and even subtle changes in someone’s handwriting that might allow you to infer their mood. But the expanded system of conveying emotional nuance through text we’ve come up with instead is so nuanced and idiosyncratic that if I’m typing a personal sort of communication for someone—say, when I’m in the passenger’s seat and a text on the driver’s phone needs to be replied to right away—I find I need to inquire in great detail how exactly they want me to type. Period, exclamation mark, or simple linebreak at the end of the utterance? How much capitalization? Do any letters need to be repeated? Likewise, if I receive a message authored by someone other than the owner of the phone, I can often tell the difference. Expressive typography makes electronic communication anything but impersonal.
I, for one, think this change is fantastic. Even if this increased attention to typographical tone of voice did mean the decline of standard punctuation, I’d gladly accept the decline of standards that were arbitrary and elitist in the first place in favor of being able to better connect with my fellow humans. After all, a red pen will never love me back. Perfectly following a list of punctuation rules may grant me some kinds of power, but it won’t grant me love. Love doesn’t come from a list of rules—it emerges from the spaces between us, when we pay attention to each other and care about the effect that we have on each other. When we learn to write in ways that communicate our tone of voice, not just our mastery of rules, we learn to see writing not as a way of asserting our intellectual superiority, but as a way of listening to each other better. We learn to write not for power, but for love. But for all the subtle vocal modulations that typography can express, we’re not just voices. We still need a way to convey the messages that we send with the rest of our bodies.
Chapter 5
Emoji and Other Internet Gestures
Our bodies are a big part of the way we communicate.
If someone stamps into a room with a furrowed brow, slams the door, and proclaims, “I’M NOT ANGRY,” you believe their body, not their words.
If someone is sobbing and wiping away tears as they say, “No, no, there’s nothing wrong,” you don’t reply, “Great, glad to hear it, that’s a relief, let’s go dancing!” You say something like “I mean, it sure looks like something’s wrong, but if you don’t want to talk about it I understand.”
If a good friend looks you in the eye, grins, and says, “You’re the most terrible person I’ve ever met!” you don’t think, “Oops, I guess this person isn’t my friend after all.” You think, “Awesome, we’re such close friends that we can mock-insult each other and we both know we don’t mean it!”
Likewise, a lot of our language about emotion is embodied—our hearts race, our eyebrows arch, our cheeks flush, our stomachs butterfly, our throats, um, frog. Writing is a technology that removes the body from the language. That’s its greatest advantage—it’s easier to transport and store words written on paper or in bytes than embodied in an entire living human or a hologram of one. Sometimes, you wouldn’t even want the body component: just because I ambitiously decide to keep a copy of Plato’s Republic beside the toilet does not mean that I want Zombie Socrates taking up residence in my bathroom.
But the lack of a body is also writing’s greatest disadvantage, especially when it comes to representing emotions and other mental states. In the early days of going online, it seemed like we had a very clear eventual answer to the question of virtual embodiment. In the future envisioned by Neal Stephenson’s 1992 novel Snow Crash, or the 3D virtual world Second Life, launched in 2003, it seemed like we’d all be making full-bodied avatars for ourselves, with hands and feet and hairstyles, to bodily interact with each other in virtual space. The idea was that these avatars would project in cyberspace whatever we’d do in the physical world, whether logistical or emotive: thus we’d enter rooms and shake hands and roll on the floor laughing.
On a technical level, we’ve gotten quite good at projecting and manipulating a virtual body. It’s a central feature for entire genres of videogames, from first-person shooters to The Sims. But for general socializing, it never quite took off. Second Life made a lot of headlines, but it remained popular only in a smallish subcommunity of internet users, and similar efforts are even more obscure. The closest thing most of us have to a social avatar is the profile picture we use on social media apps, hardly the ambitious three-dimensional graphics that Second Life or Snow Crash imagined. True, profile pictures do provide some sense of who you’re talking to and what they (or their dog) look like. But they’re static. My profile pic has the same fixed, photographic smile, regardless of the message I type beside it. What we really need is a dynamic system. Punctuation is good at representing tone of voice, but we’re still missing something, something embodied. This was the void that emoticons and emoji stepped into, those smiley faces made from repurposed punctuation marks and those small pictures of faces and hearts and animals and all manner of other objects.
I first got involved with the linguistics of emoji in 2014. I’d written some articles about meme linguistics and internet linguistics, and as emoji started hitting the news in a big way (over six thousand articles were written about the emoji released in 2014 alone), I became one of the people who journalists and tech companies called up to analyze emoji. I did a talk at South by Southwest, a tech culture conference, in partnership with smartphone keyboard app SwiftKey, looking at the overall picture of how people used emoji based on SwiftKey’s billions of anonymized data points. When we put the proposal together in 2015, I was slightly worried that emoji would have blown over by the time of the conference, a whole eight months away. But instead, they were more popular than ever, and the jam-packed room of people who came to our talk agreed, as did the newspapers in six countries that reported on the talk later.
The question on everyone’s mind was: Why? Why were emoji so popular, so quickly? By the time you’ve called up a linguist to answer this question, you’ve pretty much decided that the answer is “because they’re a new language.” But as the linguist being called up, I wasn’t so sure. I was just as fascinated as anyone by emoji as a phenomenon, but linguists have a definition of what language is, and it’s very clear that emoji don’t fit in it.
Here’s a demonstration: when we were coming up with the South by Southwest talk, we spent about half a minute batting around the idea of whether we could give the talk entirely in emoji, before realizing that it would be impossible to convey anything useful or interesting that way. Even just putting the slides entirely in emoji was too much: we needed to be able to label our graphs and ask focusing questions. In comparison, I also speak French, and I could definitely have given the talk in French, even though I would’ve had to look up a few words. I could’ve also attempted to give the talk in Spanish or German, and the fact that I couldn’t give a talk in the rest of the world’s seven thousand languages is not due to any failing on their parts, simply my lack of fluency. (Alas, being a linguist has not conferred upon me the ability to speak all the languages.) Yet no matter how “emoji-fluent” we and our audience were, there was no way to give the presentation entirely in emoji: a whole hour of reciting emoji might be an interesting piece of performance art, but there was no way for it to be the funny, informative talk we’d promised. There isn’t even a clear way to say “emoji” in emoji, let alone a way to render, say, this paragraph. Real languages can handle meta-level vocabulary and adapt to new words with ease: every language has a name for itself, and many have recently acquired a word for “emoji,” just to take one salient example. Emoji aren’t capable of either.
What Are Emoji For?
Emoji aren’t the same as words, but they’re clearly doing something important for communication: I just needed to articulate what that was. Inspired by the fact that the face and hand emoji were consistently the most popular, I began talking about emoji as gesture. I made lists of common gestures and emoji to find correspondences. The lists got long: shrug, thumbs up, pointing finger, rolling eyes, middle finger, winking face, clapping hands, and so on. All of these exist in gestural and emoji form, but there were also many that didn’t: the eggplant emoji and the fire emoji didn’t have equivalent gestures, and nodding “yes” or shaking one’s head “no” didn’t have emoji. I was at a standstill.
That’s when I sent a draft of my emoji analysis to Lauren Gawne, an Australian linguist who’s also a good friend and my cohost on the podcast Lingthusiasm. She highlighted my list and commented, “You know there’s a name for this kind of gesture in the literature, right? They’re called emblems.”
I did not.
Oh, I knew Gawne did gesture research, but we’d never really talked about it much. After all, it’s not like we could gesture on the podcast. I assumed she didn’t care to talk about the gesture side of her research. She assumed I wasn’t interested. Suddenly, I was very interested.
You know how when you learn the word “schadenfreude,” something clicks into place? You’re not uniquely terrible in occasionally taking pleasure in someone else’s misfortune: other people have been there before you. I’d been spending months and years with emoji, categorizing and analyzing them, and here was this one word that blew them all open. Someone had been here before me, a whole body of scholars, in fact, and they’d figured stuff out. I dived headfirst into the gesture literature. By the time Gawne was awake again in Melbourne, I’d exhausted Wikipedia and sent her a dozen questions. Delighted, she forwarded me the reading list for her whole gesture course.
I spent the next week in a daze. It was like I was thirteen again, encountering linguistics for the very first time: eavesdropping on people in public places with fresh ears and eyes, thoughtfully examining the positions of my hands and fingers in cafés the way that I’d once experimented with sounds and sentences under my breath in libraries. (I became utterly incapable of having a normal conversation, because I kept getting derailed by analyzing the gestures—it was hard enough when I only got derailed by the words!) When I discovered linguistics, I learned that language isn’t just a squidgy mess of opinions and impressions: there are real patterns here that I’ve been subconsciously following all along! Even if we don’t know them all yet, they’re fundamentally knowable, and there’s a whole community of people whose mission it is to figure them out. What I hadn’t realized until now was that the same thing is true for gesture. Like I can listen to a person’s vowels and plot out which parts of the mouth it takes to make them and where that means they might have lived, thanks to my linguistic training, I could also learn to spot the different kinds of gestures and what each was for.
You might ask, as I asked myself, how did I manage to get two degrees in linguistics and go to dozens of linguistics conferences but never learn anything about gesture? I’m not alone: gesture studies has been gaining ground, but it’s still a smallish subfield. There are some universities that have gesture linguists and offer a course or a unit on gesture, but many still don’t: Gawne happened to go to universities that had gesture studies, and I happened to go to ones that didn’t. If there were gesture talks at any conferences I went to, I didn’t have enough context to know why I should make a point of attending. Neither, Gawne and I suspected, did many other linguists, because we hadn’t seen anyone else drawing these parallels between emoji and gesture either. So she started in on an academic paper using the examples I’d collected, and I rewrote this chapter with her guidance on the literature.
I’ve always been a sucker for a good taxonomy of everyday life, and gesture gave me a great one. But what was even better was that the same taxonomy worked just as well to describe how people use emoji. This was the missing link. Looking for a grand unified theory of emoji had been dooming me to failure because emoji don’t just have one function, they have a range of them. But crucially, it’s the same range that gestures have, and that’s why emoji caught on so quickly and so completely: because they gave us an easy way of representing the functions behind the gestures that are so important for our informal communication. Without realizing that either gestures or emoji were potentially systematic, a couple billion internet users had subconsciously, collectively, and spontaneously mapped the functions of the one onto the capacity of the other.
Let’s get back to emblems, the word that cracked everything open. I’d been making a list of gestures, like thumbs up, waving, winking, shrugging, jazz hands, rolling one’s eyes, giving someone the finger, tugging out an imaginary collar to indicate awkwardness, playing a metaphorical tiny violin in false sympathy, brushing imaginary dirt off one’s shoulders, dropping a metaphorical mic, making a heart with one’s fingers, and so on. Many of these gestures have direct emoji equivalents: peace sign ✌️ and thumbs up 👍 and crossed fingers 🤞 and rolling eyes 🙄 and winking 😉.
But what I’d also been doing, without realizing it, was making a list of gestures that have common names in English. I don’t have to describe to you that a wink involves the deliberate closing of one eye or that a thumbs up involves the four fingers curled in on the hand, the thumb protruding and oriented to the sky, with the palm of the hand facing towards the speaker: as a speaker of English, these are things you already know. I was doing this for purely practical reasons (describing gestures in detail takes effort), but it turns out that these nameable gestures have some important things in common. Many theorists call them emblems, the way that a Jolly Roger is an emblem of a pirate. Emblem gestures can all fit easily into a linguistic frame (try any of them in a sentence like “If we’re late, then ___”), but they’re also perfectly meaningful without speech at all. The same thing goes for many emoji (you can finish “If we’re late, then ___” with an emoji just as easily), but it’s also sufficient sometimes to reply with just a thumbs up or eye-rolling emoji.