131 The research on the physiology of confession involved people talking about the most traumatic experience of their lives into a tape recorder. All potential identifying information from the story has been changed. From an article by Pennebaker, Hughes, and O’Heeron (1987).
134 For anyone seeking a discussion of self-deception in English literature, I recommend the writer and scholar Jim Magnuson. His wisdom on this and other topics has been invaluable to me in our discussions over the years.
140 Somewhere between self-deception, deception, and marketing is the concept of “spin.” In the political world, spin is defined as the art of glossing over the truth. Indeed, after an important debate or speech, political operatives will often meet with members of the press to spin the speech of their own candidate to make it sound even better than it was and spin the opponent’s words to sound worse. In fact, a cutting-edge computational linguist at Queen’s University in Canada, David Skillicorn, has developed a model to identify spin based on linguistic features associated with deception (research.cs.queensu.ca/home/skill/election/election.html).
140–141 In his Introductory Lectures on Psychoanalysis, Freud devotes a surprising amount of time to slips of the tongue, or, as he refers to them, parapraxes. In retrospect, you can understand why he used parapraxes as a way to introduce his general theory. Everyday slips of the tongue are common and reveal certain truths about what people are really thinking.
142–143 I’m indebted to Melanie Greenberg for allowing me to reanalyze her language data.
144–146 A fascinating account of the Stephen Glass case is available at www.rickmcginnis.com/articles/Glassindex.htm. For the analyses, I only examined thirty-nine of his forty-one stories (one was co-written by someone else and the other was made up exclusively of published quotes).
Somehow relevant to this discussion is a wonderful quote by author Mary McCarthy. In 1979, she was interviewed by the television host Dick Cavett about the author Lillian Hellman, who had a reputation for fabricating some of her stories. When asked about Hellman’s work, McCarthy snorted, “Every word she writes is a lie, including ‘and’ and ‘the.’” Hellman subsequently filed a defamation suit against McCarthy, which was dropped after Hellman’s death in 1984.
149 There are several other fascinating papers that are worth mentioning in the literature on high-stakes, real-world deception detection using computerized text analysis. For example, David Skillicorn and his colleagues, as well as Max Louwerse, Gun Semin, and their colleagues, along with other labs around the world have examined the e-mails between Enron employees during the decade leading up to the company’s bankruptcy in 2001 due to its fraudulent accounting activities. In addition, David Larcker and Anastasia Zakolyukina used computerized text analysis to classify deceptive versus truthful chief executives in their quarterly earnings conference calls.
152–153 The project by Denise Huddle and me has not yet been submitted for publication. One additional finding is particularly noteworthy. Recall that I-words signal innocence. Closer inspection indicated that it wasn’t all I-words. In fact, the more that defendants used the actual word I (and I’ll, I’m, I’d, etc.), the more likely they were to be innocent. By contrast, the word me was used more by the truly guilty.
155 The dating project is by Toma, Hancock, and Ellison (2008).
165 Discrepancy or modal verbs identify words that make a distinction between an ideal and a real state. “I should be eating vegetables” indicates that I’m not eating them (reality) but the ideal person would be. Columbia University psychologist Tory Higgins has developed an elaborate theory around the self-discrepancy idea that has implications for goals, motivations, and mental health. The ideal-real discrepancy is also inherent in Robert Wicklund’s work on self-awareness.
167 I love performatives. Not only are they psychologically interesting but they are at the center of some eye-opening debates in philosophy. Check out the work of the philosopher John Searle and also John L. Austin.
168 Although not discussed here, another verbal feature of deception is the use of words such as um or er. A recent article by Joanne Arciuli and her colleagues is worth reading. Also, see Michael Erard’s book Um.
CHAPTER 7: THE LANGUAGE OF STATUS, POWER, AND LEADERSHIP
176 Recall that pronouns tell us where people are paying attention. Many of the pronoun effects we see with humans match the gaze finding among nonhuman primates. For a delightful analysis of status hierarchies in chimpanzees, see the work of Frans de Waal. In humans, visual cues to dominance have been discussed by Jack Dovidio and his colleagues.
188–189 John Dean, personal communications, August 30, 2002.
190 I’m indebted to Ethan Burris and his colleagues for allowing us to reanalyze their data for this project.
192 The leadership literature has grown increasingly complex. To get a flavor of some of the directions, see the pioneering work of Fred Fiedler. A summary of research dealing with leadership among women and men is best captured by Alice Eagly and her colleagues. David Waldman and his group have done a nice job of examining leadership attributes.
194 Another example of language shifts across languages and cultures was discovered by Doug Sofer as part of his dissertation. He found that the use of the first-person singular in letters to the presidents of the South American country Colombia between 1944 and 1958 differed by social class. As you might guess, the lower the social class of the author, the higher their use of I-words. See also the seminal work of Howard Giles on accommodation theory, a framework for understanding when people shift in their language to match their interaction partners.
194 Thanks to George Theodoridis for his observations on Ancient Greek language (personal communication, November 6, 2009).
195 Note that there are other language dimensions that we find to be linked to status. Although the effects are modest, people with lower status tend to use the following word categories more: negations, impersonal pronouns, tentative words, swear words.
CHAPTER 8: THE LANGUAGE OF LOVE
197–198 The IM interaction was part of the Slatcher and Pennebaker (2006) project. The on-air fight between Elisabeth Hasselbeck and Rosie O’Donnell that follows had been brewing for several months. O’Donnell had already announced that she was leaving the show in three weeks. The overall interaction between the two women evidenced a language style matching (or LSM) score of .94—which is exceedingly high.
202–204 The research on the mirror neuron system continues to be controversial. There is an increasing number of studies that demonstrate highly specialized brain activity in Broca’s area that reflects behavioral mimicking. The primary objection to the mirror neuron approach is that no consistent theory or model explains how it works or how it is related to cognitive activity. Particularly interesting papers are available by Rizzolatti and Craighero (2004), Kimberly Montgomery and her colleagues (2009), and Kotz et al. (2010).
207 The LSM project dealing with liars was conducted by Hancock, Curry, Goorha, and Woodworth (2008). For a report on additional text analyses of the same data, see Duran, Hall, McCarthy, and McNamara (2010).
208 Despite the intuitive appeal of multitasking, the evidence is clear that it is not an effective technique to accomplish even moderately complex tasks. A particularly convincing case of the downside of multitasking has been published by Ophir, Nass, and Wagner (2009).
211 The speed-dating project had a complicated history. Paul Eastwick, a faculty member at Texas A&M, visited our department in the spring of 2010 to describe some speed-dating research he had been doing with a colleague of his, Eli Finkel, who is at Northwestern. Molly Ireland was fascinated by his talk and asked if he would be interested in applying the LSM methodology to the speed-dating transcripts. Within a few days, Molly’s analyses yielded the remarkable finding that LSM in speed-dating conversations was a powerful predictor of later dates. Molly then added the speed-dating analyses to Richard Slatcher’s IM project (see p. 212) and, in record time, submitted the paper to a top journal, where it was accepted and published. The resulting paper is Ireland, Slatcher, Eastwick, Scissors, Finkel, and Pennebaker (2011).
212–215 The IM project was initially published as Slatcher and Pennebaker (2006) and then, with the reanalyses of the data, as Ireland, Slatcher, et al. (2011).
216 John Gottman’s research on relationships has a number of practical applications for making good marriages. In addition to his books and articles, New York Times writer Tara Parker-Pope has written a balanced book on marriage and relationships that relies on some of the most recent research.
218–223 The analyses of Elizabeth Barrett and Robert Browning, Sylvia Plath and Ted Hughes, and Sigmund Freud and Carl Jung were part of a paper published by Molly Ireland and me in 2010.
CHAPTER 9: SEEING GROUPS, COMPANIES, AND COMMUNITIES THROUGH THEIR WORDS
228 Several studies have tracked language use and its relationship with successful marriages. Not surprisingly, use of pronouns, especially we-words, between the couples is a reliable predictor. See the work of Seider and colleagues (2009) and of Rachel Simmons, Peter Gordon, and Diane Chambless (2005).
229 The project linking pronoun use among couples and heart failure was conducted by Rohrbaugh and colleagues.
The Sexton and Helmreich project focused only on flight simulation studies. Later analyses by Brian Sexton found links between low we-word use and human error in the cockpit voice recordings of planes that had crashed (personal communication, April 20, 2010). See also the work of Foushee and Helmreich.
232 One of the more interesting approaches to studying natural interactions was pioneered by Bill Ickes, a social psychologist at the University of Texas at Arlington. In a typical study, pairs of students are instructed to visit Ickes’s research lab to participate in a conversation. After both complete questionnaires and a consent form to be videotaped, the experimenter tries to begin filming and then “discovers” that his camera is broken. The experimenter leaves the lab, claiming he’s going to find a technician. The students remain in the lab and usually begin talking with one another.
What they don’t know is that another hidden camera is taping their interaction. Later, the students are told about the hidden camera and are asked to rate their interaction on a minute-by-minute basis. Ickes is able to see how the two people were thinking about each other as their conversation unfolded. Bill has kindly allowed us to analyze some of his interactions. I strongly recommend his recent book, Strangers in a Strange Lab.
And while we are talking about real-world approaches to studying the behavior of people, I insist that you check out the work of Sandy Pentland and Roz Picard, who are at MIT’s Media Laboratory. Together and separately, the two have devised a striking number of methods that track how people see and emotionally react to their worlds as they go about daily life.
232–235 One way to think about the increase in we-words over time is that the longer people talk with others, the more their identities become fused. Bill Swann and his colleagues have been conducting a number of imaginative projects tracking identity fusion. For example, making people more aware of their own group increases the likelihood that they will endorse fighting and dying for it.
233 The national defense project was run by Andrew Scholand, Yla Tausczik, and me and funded by Sandia National Laboratories. The research tracking twenty professional therapists over three years was conducted by Susan Odom and Stephanie Rude. The findings are reported in Odom’s dissertation, which was completed in 2006.
234–235 Drops in suicide rates following terrorist attacks have been reported by Emad Salib and his colleagues. Additional findings about language and psychological changes following the subway bombings in Madrid in 2004 have been reported by Itziar Fernandez, Dario Paez, and me. The language changes in written essays among New Orleans residents after Hurricane Katrina were collected by Sandy Hartman.
238–239 A former graduate student of mine, Amy Gonzales, conducted a complex laboratory experiment where groups of students had to work together either in face-to-face groups or in online groups. The details are reported in Gonzales, Hancock, and Pennebaker (2010). A second project, which was described earlier, was run with business school students by Ethan Burris and his colleagues. The two lab studies are consistent with some fascinating real-world projects conducted by Paul Taylor and his colleagues. For example, Taylor found higher LSM levels in the transcripts of successful hostage negotiations between police and hostage-takers in the UK relative to unsuccessful hostage negotiations.
240–243 The Craigslist project is part of a larger study focusing on measures of community cohesiveness. The primary team members include Cindy Chung, Yla Tausczik, and me. We are indebted to Mark Hayward for his help in providing the relevant Gini statistics.
243–247 The word-catching research is based on an archive of tape recordings I have collected between 1990 and 2010. They include the analyses of 1,162 conversational files of people in the real world having natural conversations. Discriminant analyses (for you statistics fans out there) show that cross-validation classifications are accurate at 80 to 84 percent for anywhere from five to seven settings, where 16 to 20 percent is chance.
248 One of my favorite language maps tracks the usage of the words pop, soda, and Coke as generic names for soft drinks. Check out www.popvssoda.com.
248–253 One of the giants in the world of sociolinguistics is William Labov from the University of Pennsylvania. Labov has pioneered ways to track how word usage and accents change across regions and time. Some of his early work, for example, examined language differences within blocks and neighborhoods of large cities. Later, he began to focus on much broader trends across the United States.
Due in large part to Labov’s influence, the University of Pennsylvania has taken an important lead in advancing our knowledge of social communication and language use. It houses the Linguistic Data Consortium, or LDC (www.ldc.upenn.edu), which houses one of the largest text archives in the world. In addition, Mark Liberman—a particularly thoughtful linguist—has created Language Log, a highly influential blog site (languagelog.ldc.upenn.edu).
249–251 The This I Believe project has been growing in multiple directions. Cindy Chung, Jason Rentfrow, and I have been developing detailed maps of language use across the United States based on both function words and content words.
251–252 A particularly hot approach to text analysis examines how people use emotion words in their blogs, tweets, or other communications. Although sentiment analysis focuses only on people’s use of positive and negative emotion words, it can provide a general overview of the happiness of cities, regions, or entire countries. For a discussion, see the work of Adam Kramer, Jason Rentfrow, and also Alex Wright’s article in the New York Times. Also, check out a truly wonderful book by Eric Weiner, The Geography of Bliss, on one man’s attempt to understand why some countries are happier than others.
252 In deducing the linguistic fingerprint of the Texas high schools, discriminant analyses showed that we could accurately classify students at a 19 to 20 percent rate, where 11 percent was chance.
CHAPTER 10: WORD SLEUTHING
258–261 Matching blog entries to specific authors can be done in a number of ways. In the chapter, we try to match blogs written today with those written many years ago by the same authors. This is much harder than matching blogs written by authors at about the same time. In fact, think back to the example of the twenty bloggers. Imagine we have, say, ten blog entries on consecutive days from each of the twenty people. We pull out one of the ten entries for each person and put this into a separate stack. The goal is to match the twenty “orphan” entries with the twenty bloggers by reading the nine blog entries of known authorship. Our computer does a much better job at guessing which orphan entry goes with which blogger. The overall hit rate is closer to 58 percent (where 5 percent is chance).
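For readers who want to see the orphan-entry logic in concrete form, here is a minimal sketch in Python. It is not the procedure used for the blog analyses: the short function-word list, the nearest-profile matching rule, and the toy texts are all assumptions made purely for illustration. What it does show is how held-out entries are compared with entries of known authorship, and why chance is 5 percent when there are twenty bloggers.

```python
import math
from collections import Counter

# Hypothetical, very short function-word list (an assumption for this sketch).
FUNCTION_WORDS = ["i", "the", "and", "of", "to", "a", "but", "not"]


def profile(texts):
    """Rate of each function word across a list of texts."""
    words = [w for t in texts for w in t.lower().split()]
    counts = Counter(words)
    total = max(len(words), 1)
    return [counts[w] / total for w in FUNCTION_WORDS]


def distance(p, q):
    """Euclidean distance between two function-word profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def match_orphans(known, orphans):
    """Guess the author of each orphan entry by the nearest known profile.

    `known` maps author -> list of entries of known authorship.
    `orphans` maps the true author -> the single held-out entry.
    """
    profiles = {author: profile(texts) for author, texts in known.items()}
    guesses = {}
    for true_author, text in orphans.items():
        p = profile([text])
        guesses[true_author] = min(profiles, key=lambda a: distance(p, profiles[a]))
    return guesses


def hit_rate(guesses):
    """Percentage of orphan entries assigned to their true author."""
    return 100.0 * sum(t == g for t, g in guesses.items()) / len(guesses)


def chance_rate(n_authors):
    """Expected hit rate, in percent, from guessing at random."""
    return 100.0 / n_authors


# Toy data, invented for illustration only.
known = {
    "A": ["i think i know the answer", "i hope i am not late"],
    "B": ["the cat and the dog", "the end of the road"],
}
orphans = {"A": "i wish i could go", "B": "the top of the hill"}

guesses = match_orphans(known, orphans)
print(guesses, hit_rate(guesses))
print(chance_rate(20))  # twenty bloggers -> 5.0 percent
```

A real analysis would use a far larger word list and a proper classifier with cross-validation; the point of the sketch is only the hold-one-out structure and the chance baseline.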
262–265 In addition to the work of Adair and of Mosteller and Wallace dealing with the Federalist Papers, be sure to see recent articles by Patrick Juola (2006) and by Jeff Collins and his colleagues (2004).
265 Pardon me for a minute while I have a little chat with the twenty people on Earth who really, really want to know the methods for analyzing the Federalist Papers. The cross-validation approach is based on discriminant analyses assuming equal group size. The original function-word assignment method, which assigned all unknown texts to Madison, correctly classified 92.4 percent of the original essays and 86.4 percent for cross-validation. The numbers for function words plus punctuation were 98.5 percent and 84.8 percent. Analyses based on the fourteen “tell” words used a binary procedure (was the word used or not within an essay) and yielded both classification and cross-validation accuracies of 98.5 percent. The one assignment error was for essay forty-one, which is attributed to Madison. The tell-word analyses estimated that Hamilton was the author of 49, 52 through 57, and 63, and that Madison was the author of 50, 51, and 62.
Whereas Hamilton claimed credit for all eleven of the unknown manuscripts, he reported that three additional ones were jointly written by Madison and himself. Madison’s later recollection was that he (Madison) had written them with some supplemental comments by Hamilton. All linguistic analyses show that the jointly written papers were completely different from either Hamilton’s or Madison’s solo-authored pamphlets. Given this, I tend to side with Hamilton’s accounts of the authorship issue rather than with Madison’s.
265–267 A recent project by Terry Pettijohn and Donald Sacco (2009) analyzed the lyrics of number one Billboard songs between 1955 and 2003. They discovered that during economic downturns, people preferred lyrics that were more complex, social, and future oriented.
268 There are several ways to determine if collaborations result in average or synergistic language use. Consider how John Lennon and Paul McCartney used present-tense verbs in their lyrics. For their individually written songs, Lennon consistently used more than McCartney (15.8 percent versus 13.7 percent). According to the average-person hypothesis, their collaboration should have resulted in songs that ranged between 13.7 and 15.8 percent present-tense verbs. In fact, the Lennon-McCartney eyeball-to-eyeball collaborations resulted in songs with 17.6 percent present-tense verbs. In this case, Lennon fell somewhere between McCartney and Lennon-McCartney, which makes him the average writer. We can calculate the percentage of time that Lennon, McCartney, and Lennon-McCartney produced songs that were in the middle of the other two linguistically. The author who was statistically the average person for the Beatles was: 50.6 percent for Lennon, 36.1 percent for McCartney, and 13.3 percent for Lennon-McCartney. The statistically average author for the Federalist Papers was: 39.5 percent for Hamilton, 53.9 percent for Madison, and 6.6 percent for Hamilton-Madison. In other words, when collaborating, Lennon-McCartney and Hamilton-Madison were far more extreme than either author on his own.
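The "who is in the middle" tally can be sketched in a few lines of Python. This is an illustrative sketch rather than the original analysis code: the function names are hypothetical, ties are ignored, and only the present-tense figures (15.8, 13.7, and 17.6 percent) come from the note above.

```python
from collections import Counter


def middle_author(rates):
    """Return the author whose rate falls between the other two.

    `rates` maps each of exactly three authors to their rate on one
    language dimension. Ties are not handled in this sketch.
    """
    if len(rates) != 3:
        raise ValueError("expected exactly three authors")
    ordered = sorted(rates, key=rates.get)  # authors from lowest to highest rate
    return ordered[1]


def middle_shares(dimensions):
    """Percentage of dimensions on which each author is the middle one."""
    counts = Counter(middle_author(rates) for rates in dimensions)
    total = len(dimensions)
    return {author: 100.0 * n / total for author, n in counts.items()}


# Present-tense verb rates from the note above.
present_tense = {"Lennon": 15.8, "McCartney": 13.7, "Lennon-McCartney": 17.6}
print(middle_author(present_tense))  # Lennon
```

Running `middle_shares` over one dictionary per language dimension would give percentages of the kind reported above. If collaboration simply averaged the two writers, the joint author would be in the middle most of the time; a low share for the joint author is what points to collaborations being more extreme.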
The Secret Life of Pronouns: What Our Words Say About Us Page 31