People still flame one another See Todd Dugdale, “Sandbaggers and Trolls,” kd0tls Ham Radio Experience, January 6, 2014, kd0tls.blogspot.com/2014/01/sandbaggers-and-trolls.html/.
The government has the greatest vested My discussion of government surveillance of unrest, and the work of Peter Gloor at MIT, draws from “What Makes Heroic Strife,” Economist, April 21, 2012, economist.com/node/21553006/.
27.5 percent of Twitter’s 500 million tweets This number is from analysis of my randomized research sample.
Facebook’s data team Facebook’s data analysis is always done with anonymized and aggregated data. This discussion of iterations surrounding the “No one should …” meme, and the attendant table, was drawn from Lada Adamic et al., “The Evolution of Memes on Facebook,” January 18, 2014, facebook.com/notes/facebook-data-science/the-evolution-of-memes-on-facebook/10151988334203859. The post leaves it unclear how political bias was determined. My best guess is from users’ “like” patterns.
In 1950 This paragraph discussing polarization in American politics is based on Jill Lepore, “Long Division,” The New Yorker, December 2, 2013.
“It has always been a mystery” I read The Life of Mahatma Gandhi by Louis Fischer (New York: Harper & Brothers, 1950) in 2007, and this quote has stuck with me since.
Chapter 10: Tall for an Asian
To find out what’s actually special to a particular group The method for reducing a group’s collected profile text to the idiosyncratic essentials I present in this chapter is my own. However, the OkCupid blog post that inspired this work—“The Real Stuff White People Like”—used a different method, developed with help from Max Shron and Aditya Mukerjee. I would not have developed my own method in this book without their prior example for that post. I developed my own method because the one used for that post had me sorting the nonsense from the “real data” as the final step. For this book, I wanted something completely algorithmic, where no human selection came into play. The method is as described—you plot the words and phrases on the grid by their percentiles and then rank them by their Euclidean distance from the desired corner of the square.
The human element came into play only in the few cases where redundant phrases, such as “my blue eyes and,” “blue eyes and,” and “my blue eyes” appeared on the list together. In those cases, I took the most representative word or phrase and deleted the others. The lists were not meaningfully altered by this. The method considered all phrases of four words or fewer that appeared in more than thirty profiles.
Because of space considerations three lengthy entries were pared down to avoid line wrapping. In the male antithesis table I used “follow me” instead of “follow me on instagram.” In the female antithesis, I used “malcolm x” instead of “biography of malcolm x,” and in the words by orientation table in the next chapter I used “feminine women” instead of “attracted to feminine women.”
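The ranking step described in this note can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for the book: the function name, the sample percentiles, and the choice of which axis stands for the group and which for everyone else are all hypothetical.

```python
import math

def rank_by_corner_distance(points, corner=(1.0, 0.0)):
    """Rank phrases by Euclidean distance from a target corner of the unit square.

    points maps phrase -> (x, y), both percentiles in [0, 1].
    Assumption: x is the phrase's percentile within the group and y is its
    percentile among everyone else, so (1, 0) means "common here, rare elsewhere".
    """
    def distance(item):
        _, (x, y) = item
        return math.hypot(x - corner[0], y - corner[1])

    # Closest to the corner first.
    return [phrase for phrase, _ in sorted(points.items(), key=distance)]

# Made-up percentiles, for illustration only.
sample = {
    "phrase a": (0.95, 0.10),
    "phrase b": (0.90, 0.85),
    "phrase c": (0.40, 0.05),
}
ranking = rank_by_corner_distance(sample)
```

Passing a different `corner` would rank phrases toward the opposite extreme, which is one plausible way to produce an "antithesis" list from the same grid.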
something called Zipf’s law I was familiar with power law distributions already. However, I used the “Zipf’s law” Wikipedia page for more information on the law. “Zipf’s Law and Vocabulary,” by C. Joseph Sorell, The Encyclopedia of Applied Linguistics, November 5, 2012, was also a resource. The table in the text was excerpted from a longer table presented in that paper.
The Irish and eastern Europeans From Nell Irvin Painter’s The History of White People (New York: W. W. Norton, 2010).
in Mexico I lived in Mexico for several years as a child and have retained an interest in its politics. See Ronald Loewe, Maya or Mestizo?: Nationalism, Modernity, and Its Discontents (Toronto: University of Toronto Press, 2010).
“From empathy and sexuality” See Bobbi J. Carothers and Harry T. Reis, “Men and Women Are from Earth: Examining the Latent Structure of Gender,” Journal of Personality and Social Psychology 104, no. 2 (2013): 385–407. “Men Are from Mars Earth, Women Are from Venus Earth” is the title of the article’s précis: sciencedaily.com/releases/2013/02/130204094518.htm.
Aristotle looked to the emptiness I was already familiar with the heavens’ role in Einstein’s and Newton’s work. For the third, older, example, I hunted around Wikipedia until I found an example I liked. See the entry for “Aether (classical element).”
Chapter 11: Ever Fallen in Love?
A few years ago a couple of MIT students Here, I used “Project ‘Gaydar,’ ” by Carolyn Y. Johnson, Boston Globe, September 20, 2009, and the students’ original paper, “Gaydar: Facebook Friendships Expose Sexual Orientation” by Carter Jernigan and Behram F. T. Mistree, First Monday 14, no. 10 (2009), firstmonday.org/article/view/2611/2302.
The Kinsey Report in 1948 See Wikipedia’s “Kinsey Reports” entry, which summarizes the male and female editions of Kinsey’s work. The 10 percent number for men is straightforward. There is less certainty in the report around women’s sexuality. The report says 2 to 6 percent of females aged twenty to thirty-five are “exclusively” homosexual.
Later studies See Wikipedia’s “Demographics of sexual orientation” for all kinds of numbers. Also see “LGBT demographics of the United States.”
“This work can usefully” Dan Black et al., “Demographics of the Gay and Lesbian Population in the United States: Evidence from Available Systematic Data Sources,” Demography 37, no. 2 (2000): 139–54.
This surely involves a painful choice See Assi Azar, “Op-ed: To You There, in the Closet,” The Advocate, April 16, 2013, advocate.com/commentary/2013/04/16/op-ed-you-there-closet/.
no more unusual than naturally blond hair My source is Professor C. George Boeree, of Shippensburg University. See his post “Race” at webspace.ship.edu/cgboer/race.html. Even back-of-the-envelope math proves his point: there are roughly 1 billion Europeans, Canadians, Americans, and Australians on Earth. If 1 in 6 of them is naturally blond, which in my personal circle would be a vast overestimate, that’s 2 percent of the world right there.
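The back-of-the-envelope arithmetic in this note can be written out as follows. The 1 billion and 1-in-6 figures come from the note itself; the world population of roughly 7 billion is an assumption added here for illustration.

```python
# Figures from the note.
western_population = 1_000_000_000
blond_fraction = 1 / 6

# Assumption (not in the note): a world population of about 7 billion.
world_population = 7_000_000_000

blond_people = western_population * blond_fraction
share_of_world = blond_people / world_population
print(f"{share_of_world:.1%}")
```

Under these assumptions the share comes out a little above 2 percent, consistent with the rounded figure in the note.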
According to Stephens-Davidowitz My four-page discussion of gay porn searches and their implications adapts findings from Stephens-Davidowitz’s piece “How Many American Men Are Gay?” New York Times, December 7, 2013. Both the Google Trends data I cite and its extension to Nate Silver’s findings and to Gallup’s state-by-state numbers are based on that article. Silver’s original piece is “How Opinion on Same-Sex Marriage Is Changing, and What It Means,” from his New York Times fivethirtyeight blog, fivethirtyeight.blogs.nytimes.com/2013/03/26/how-opinion-on-same-sex-marriage-is-changing-and-what-it-means/.
Gallup’s numbers are from Gary J. Gates and Frank Newport, “LGBT Percentage Highest in D.C., Lowest in North Dakota,” gallup.com/poll/160517/lgbt-percentage-highest-lowest-north-dakota.aspx.
so does mobility data from Facebook In his article, Stephens-Davidowitz also extended his research into publicly available Facebook profile data.
often attributed to Thoreau The quote itself is a combination of a passage in Thoreau’s Walden with two lines of Oliver Wendell Holmes’s poem “The Voiceless.” See The Walden Woods Project: walden.org/Library/Quotations/The_Henry_D._Thoreau_Mis-Quotation_Page.
The old economic “misery index” is See Wikipedia’s “Misery index (economics).” Arthur Okun suggested the original formulation.
“Respondents who identified” See Mackey Friedman, “Considerable Gender, Racial and Sexuality Differences Exist in Attitudes Toward Bisexuality,” ScienceDaily, November 5, 2013, sciencedaily.com/releases/2013/11/131105081521.htm.
Gerulf Rieger of the University of Essex I reference a pair of papers by Professor Rieger and his team: Gerulf Rieger, Meredith L. Chivers, and J. Michael Bailey, “Sexual Arousal Patterns of Bisexual Men,” Psychological Science 16, no. 8 (2005): 579–84, and its successor, Gerulf Rieger et al., “Male Bisexual Arousal: A Matter of Curiosity?,” Biological Psychology 94, no. 3 (2013): 479–89.
Ellyn Ruthstrom See David Tuller, “No Surprise for Bisexual Men: Report Indicates They Exist,” New York Times, August 22, 2011, and Meredith Melnick, “Scientific Study Finds That Bisexuality Really Exists,” Time, August 23, 2011, healthland.time.com/2011/08/23/scientific-study-finds-that-bisexuality-really-exists/.
On Facebook 58 percent See Chris Taylor, “Fake Facebook Users Likely to Be Popular Bisexual College Women,” Mashable, February 3, 2012, mashable.com/2012/02/03/fake-facebook-users-bisexual-college-women/.
Though people have been gay forever See Wikipedia’s “Timeline of LGBT history” and “Coming out” entries. The idea of self-disclosure (that is, coming out) as an act of empowerment was originated by Karl Heinrich Ulrichs.
Chapter 12: Know Your Place
The United States and the USSR split Korea I was generally familiar with this process, mostly from American Caesar, but this incredible anecdote is mentioned on the “Division of Korea” Wikipedia entry, which cites Don Oberdorfer’s book The Two Koreas (New York: Basic Books, 2001) as the original source. I confirmed the anecdote via a search on the book’s text on Google Books: books.google.com/books/about/The_Two_Koreas.html?id=yJZKpYXh2SAC.
Here you see a plot This map, like all the full US maps in this chapter, and the Reddit plot, was made by James Dowdell. This one was made using a standard Voronoi partition of the United States, with each Craigslist market serving as the “capital” of a “state” (called “seeds” and “cells”). Though the plot looks complex, it’s actually very elegant: the segments are all the points equidistant to the two nearest seeds. I’ve seen various other versions of this same plot. My version was inspired by one made by IDV Solutions and posted by “john.nelson” on their UX blog: uxblog.idvsolutions.com/2011/07/chalkboard-maps-united-states-of.html.
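The idea behind a Voronoi partition can be sketched briefly: every location belongs to the cell of its nearest seed, and cell boundaries are the points equally far from two seeds. The seed names and coordinates below are hypothetical, and the sketch assumes straight-line distance on a flat plane, which is a simplification of whatever projection the actual map used.

```python
import math

def nearest_seed(point, seeds):
    """Return the name of the seed closest to point (straight-line distance)."""
    return min(seeds, key=lambda name: math.dist(point, seeds[name]))

# Hypothetical seeds on a flat plane; real markets would use map coordinates.
seeds = {
    "market_a": (0.0, 0.0),
    "market_b": (10.0, 0.0),
    "market_c": (0.0, 10.0),
}

# Assign a few sample locations to their cells.
cell_of = {p: nearest_seed(p, seeds) for p in [(1.0, 1.0), (9.0, 2.0), (2.0, 8.0)]}

# A point on the boundary between two cells is equidistant from both seeds.
boundary_point = (5.0, 0.0)
d_a = math.dist(boundary_point, seeds["market_a"])
d_b = math.dist(boundary_point, seeds["market_b"])
```

Assigning every pixel of a map this way reproduces the cells; dedicated geometry libraries compute the boundary segments directly rather than testing points one at a time.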
venue of longing is Walmart This is the same Voronoi plot, but combined with the by-state data from Dorothy Gambrell’s “Missed Connections” map, published in Psychology Today. The cells are coded by the top missed-connection result for the state where their seed lies. You can see the original map here: psychologytoday.com/blog/brainstorm/201302/missed-connections-0.
I transported the data to the previous Voronoi partition in order to maintain consistency with the previous Craigslist map.
Years ago, an enterprising hacker The hacker is Pete Warden, and his post is “How to Split Up the US,” which you can find here: petewarden.com/2010/02/06/how-to-split-up-the-us/. As Warden notes in a later post, “Why You Should Never Trust a Data Scientist,” his grouping of the United States into the seven new zones is arbitrary—the data science version of “for entertainment purposes only.” I reference them here in that spirit.
Matthew Zook, a geographer Professor Zook and his team maintain a fantastic geography blog called Floating Sheep, and that blog was my primary source for his work: floatingsheep.org.
The earthquake discussion and the map are drawn from “Mapping the Eastern Kentucky Earthquake” posted on the Floating Sheep blog by Taylor Shelton. My image is a reproduction of the original, simplified for print: floatingsheep.org/2012/11/mapping-eastern-kentucky-earthquake.html.
The DOLLY team is Matthew Zook, Mark Graham, Taylor Shelton, Monica Stephens, and Ate Poorthuis. Poorthuis narrates the Sint Maarten walkthrough, which can be found here: www.youtube.com/watch?v=pD9HWAaQGUA.
My discussion of the student riot is drawn from the paper “Beyond the Geotag: Situating ‘Big Data’ and Leveraging the Potential of the Geoweb,” by Jeremy W. Crampton et al., Cartography and Geographic Information Science 40, no. 2 (2013): 130–39.
Below is a plot of gay porn downloads An IP address does not pinpoint any one person (or, more precisely, any one computer) to an exact location, only to a range of about ten to fifty miles. It is roughly the same technology used by, say, weather.com, to guess which city’s weather to show you by default before you give it a zip code. It tells us only the general area from which a computer is accessing the Internet. From this research, we know nothing about the computers themselves other than what porn they were downloading; and we know absolutely nothing about who was actually using the computer or, in some cases, whether a person was involved at all.
a forty-year-old woman in the Bay Area See “I’m Just Gonna Throw This Out There. Any Redditors in the SF Bay Area Have a Empty Spot at Their Table for a Lonely Thanksgiving Orphan?” posted by user “MeMyselfOhMy” on Reddit: reddit.com/r/AskReddit/comments/ebhh1/.
topics that you’ll only find on Reddit The example posts mentioned were all on the front page of their respective subreddits on January 30, 2013.
Anderson’s main topics are nationalism Showing the flexibility of his theory, many of Anderson’s ideas on nationhood are surprisingly applicable to online communities. He describes nations as “both inherently limited and sovereign” and “conceived as a deep, horizontal comradeship.” And especially applicable to the Internet is this passage: “This new synchronic novelty could arise historically only when substantial groups of people were in a position to think of themselves as living lives parallel to those of other substantial groups of people—if never meeting, yet certainly proceeding along the same trajectory.” Benedict Anderson, Imagined Communities (London: Verso, 1983), 6, 191–92.
a worldwide look at modern large-scale movements I obtained permission from the Facebook researchers Aude Hofleitner, Ta Virot Chiraphadhanakul, and Bogdan State to reproduce their map and discuss their results. They asked that I include a more robust explanation of “coordinated migration” and of their study. Here are their words:
In a coordinated migration, a significant proportion of the population of a city has migrated, as a group, to a different city. More specifically, a flow of population from city A (hometown) to another city B (current city) is considered a coordinated migration if, among the cities in which people from hometown A currently live, city B is the city with the largest number of individuals with current city B, and hometown A. There are numerous migrations to, from, and within the United States but they do not exhibit this coordinated property because there is no overly dominant attractive city and people move to different areas. This map displays chunks of the small towns and villages of Southeast Asia relocating en masse, in a coordinated fashion, to the urban centers.
For more information and the full study, please refer to the Facebook Data Science post on Coordinated Migration: www.facebook.com/notes/facebook-data-science/coordinated-migration/10151930946453859.
As you’ll see when you visit the link, in reproducing their work, I modified their original map by removing the labels and focusing on a smaller part of the region, to make the map more readable in print. Thank you to Mike Develin, also at Facebook, for helping facilitate permission for this reproduction. All Facebook Data Science work is done on anonymized and aggregated data.
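The definition of coordinated migration quoted in this note can be illustrated with a short sketch. It is not the Facebook researchers’ code: the function name and the sample records are hypothetical, and the sketch makes two simplifying assumptions, namely that people still living in their hometown are not counted as movers and that no minimum share of the population is required.

```python
from collections import Counter, defaultdict

def top_destinations(records):
    """Map each hometown to the current city holding the most of its movers.

    records is an iterable of (hometown, current_city) pairs.
    Assumption: people still living in their hometown are not counted as movers.
    """
    counts = defaultdict(Counter)
    for hometown, current in records:
        if hometown != current:
            counts[hometown][current] += 1
    # For each hometown, keep the single most common destination.
    return {home: c.most_common(1)[0][0] for home, c in counts.items()}

# Made-up records, for illustration only.
records = [
    ("town_a", "city_x"), ("town_a", "city_x"), ("town_a", "city_y"),
    ("town_a", "town_a"),
    ("town_b", "city_y"), ("town_b", "city_y"),
]
flows = top_destinations(records)
```

A fuller treatment would also check that the flow represents a significant proportion of the hometown’s population, as the quoted definition requires; the full study linked above is the place to look for the exact criteria.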
Chapter 13: Our Brand Could Be Your Life
But what they don’t tell you See Clare Baker, “Behind the Red Triangle: The Bass Pale Ale Brand and Logo,” Logoworks.com, November 8, 2013, logoworks.com/blog/bass-pale-ale-brand-and-logo/.
Archaeologists have unearthed My discussion of branding in ancient times is based on David Wengrow, “Prehistories of Commodity Branding,” Current Anthropology 49, no. 1 (2008): 7–34, and Gary Richardson, “Brand Names Before the Industrial Revolution,” NBER Working Paper No. 13930, National Bureau of Economic Research, Cambridge, MA, 2008. http://papers.nber.org/paper/w10411.
In 1997, Tom Peters See “The Brand Called You” by Tom Peters, published in Fast Company, August/September 1997, fastcompany.com/28905/brand-called-you.
still read in marketing classes See “What a great article. I was given this to read for a class of mine, and it is written brilliantly. Great insight and information on branding. Thanks!!” a comment by user “Morgan” on Peters’s article on Fastcompany.com.
a man named Peter Montoya Montoya’s first work on the topic was titled The Brand Called You: The Ultimate Brand-Building and Business Development Handbook to Transform Anyone into an Indispensable Personal Brand, by Peter Montoya and Tim Vandehey (self-published, 2003). This was then republished as The Brand Called You: Make Your Business Stand Out in a Crowded Marketplace (New York: McGraw-Hill, 2008), which according to Amazon was an “international bestseller.” A PDF of the first chapter is hosted here: petermontoya.com/pdfs/tbcy-chapter1.pdf. Montoya’s personal site redirects to marketinglibrary.net, where you can book him for speaking engagements.
You can see the birth of the idea For this chart, I subtracted the long-standing idiom of “personal brand of” (as in “personal brand of leadership”) from the results for “personal brand” to isolate the self-marketing phenomenon.
Dale Carnegie I relied on Wikipedia’s “Dale Carnegie” entry for basic details on his life.
For every kid who tweets herself The two incidents I allude to here are Bernie Zak’s campaign to get into UCLA, as detailed in Brock Parker, “Brookline Student Lobbies UCLA on Twitter,” Boston Globe, May 7, 2013, and Rob Meyer’s hiring by the Atlantic Monthly, as described in Alexis C. Madrigal, “How to Actually Get a Job on Twitter,” Atlantic Monthly, July 31, 2013.
See also Jason Fagone, “The Construction of a Twitter Aesthetic,” The New Yorker, February 12, 2014, newyorker.com/online/blogs/culture/2014/02/the-construction-of-a-twitter-aesthetic.html.
the different way African Americans tend My discussion of Black Twitter drew on the following sources:
Choire Sicha, “What Were Black People Talking About on Twitter Last Night?” The Awl, November 11, 2009, theawl.com/2009/11/what-were-black-people-talking-about-on-twitter-last-night.
Farhad Manjoo, “How Black People Use Twitter,” Slate, August 10, 2010, slate.com/articles/technology/technology/2010/08/how_black_people_use_twitter.html. A counterpoint to Manjoo’s piece is “Why ‘They’ Don’t Understand What Black People Do on Twitter” by Dr. Goddess, on blogspot. Goddess especially objects to the portrayal of blacks on Twitter as a “monolith”—the word appears twice in the post, and I echo it in my discussion. See drgoddess.blogspot.com/2010/08/why-they-dont-understand-what-black.html.
Dataclysm: Who We Are (When We Think No One's Looking) Page 22