by Marc Goodman
But what else might Google or any other company that had access to your pattern of life be able to determine? Let’s say, for example, your mobile phone was on the nightstand in the same home as your wife’s telephone six nights a week. From these data, it would be logical to conclude that the owners of the two cell phones lived together and were likely sleeping together. But what if one night a week your mobile phone was on a nightstand next to another woman’s mobile phone? What might that suggest to Google or others about your fidelity? An analysis of your locational data and that of the phones (and apps) around you is an excellent approximation of the strengths and bonds of your personal and professional networks. When your data exhaust patterns are studied over time, many more revelations about your life become possible. For example, researchers in the U.K. studied the past whereabouts of mobile phone users and using basic data analysis techniques were able to determine to within twenty meters of accuracy where a mobile phone user would be at the same time twenty-four hours later, a very useful tool for both advertisers and stalkers. Your phone today knows not only where you’ve been but also where you are going.
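The nightstand inference above needs very little machinery. The following is a minimal sketch only, assuming each phone's overnight position has already been reduced to a coarse location label per night. The function name `co_located_nights`, the labels, and the sample week are all hypothetical illustrations, not any company's actual method.

```python
from collections import Counter

def co_located_nights(pings_a, pings_b):
    """Count the nights on which two phones reported the same overnight place.

    Each argument maps a night (any hashable key) to a coarse location label.
    Both the keys and the labels are hypothetical stand-ins for real pings.
    """
    shared = Counter()
    for night, place in pings_a.items():
        # A night counts only if the second phone was seen at the same place.
        if pings_b.get(night) == place:
            shared[place] += 1
    return shared

# Hypothetical sample week: six nights at home, one night somewhere else.
phone_a = {"mon": "home", "tue": "home", "wed": "home", "thu": "apt_17",
           "fri": "home", "sat": "home", "sun": "home"}
spouse_phone = {night: "home" for night in phone_a}
other_phone = {"thu": "apt_17"}

with_spouse = co_located_nights(phone_a, spouse_phone)
with_other = co_located_nights(phone_a, other_phone)
```

Even this toy version shows why the pattern is revealing: the sensitive conclusion comes from comparing two unremarkable records, not from any single data point.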
Analysis of your social network and its members can also be highly revealing of your life, politics, and even sexual orientation, as demonstrated in a study carried out at MIT. In an analysis known as Gaydar, researchers studied the Facebook profiles of fifteen hundred students at the university, including those whose profile sexual orientation was either blank or listed as heterosexual. Because prior research had shown that gay men have more friends who are also gay (not surprising), the MIT investigators had a valuable data point with which to review the friend associations of their fifteen hundred students. As a result, researchers were able to predict with 78 percent accuracy whether or not a student was gay. At least ten individuals who had not previously identified as gay were flagged by the researchers’ algorithm and confirmed via in-person interviews with the students. While these findings might not be troubling in liberal Cambridge, Massachusetts, they could prove problematic in the seventy-six countries where homosexuality remains illegal, such as Sudan, Iran, Yemen, Nigeria, and Saudi Arabia, where such an “offense” is punished by death. A study of fifty-eight thousand Facebook users published in the Proceedings of the National Academy of Sciences demonstrated that by merely studying their Likes, one could determine intimate details and personality traits with surprising accuracy. The rigorous study, carried out in conjunction with the University of Cambridge, predicted whether users had a high or low IQ, were emotionally stable, or came from a broken home. The challenge with the data we are leaking is that, as has been shown numerous times, others can pick up our digital bread crumbs and interpret them without our knowledge in ways that can cause us harm.
But I’ve Got Nothing to Hide
In December 2009, when CNBC’s Maria Bartiromo asked Google’s own CEO, Eric Schmidt, about privacy concerns resulting from Google’s increasing tracking of consumers, Schmidt famously replied, “If you have something that you don’t want anybody to know, maybe you shouldn’t be doing it in the first place.” Schmidt, and others, dismiss privacy concerns by saying that if you haven’t done anything wrong, you should not be afraid of people (corporations, governments, or your neighbors) knowing what you are doing.
This sentiment has been echoed by Facebook’s CEO, Mark Zuckerberg, who has argued that “privacy is no longer the social norm.” While privacy may no longer be the norm—at least for the general public—in his own life, Mr. Zuckerberg seems to treasure privacy quite a bit. In late 2013, it was revealed that the Facebook CEO spent $30 million to buy the four homes surrounding his own property in order to ensure his privacy would remain free from intrusion or disturbance.
Facebook’s chief operating officer, Sheryl Sandberg, too, has suggested that your assertion of any privacy rights is in contrast with “true authenticity.” Sandberg notes that “expressing authentic identity will become even more pervasive in the coming years … And yes, this shift to authenticity will take getting used to and it will elicit cries of lost privacy.” Convenient for Schmidt, Zuckerberg, and Sandberg that these “naturally occurring shifts” in social norms are tied to their personal and professional bottom lines, which benefit directly when they monetize you, and the mountains of information you are leaking, to the fullest extent possible under their highly one-sided ToS.
But “I have nothing to hide” is absolutely the wrong way to think about our new dataveillance society. It is a false dichotomy of choice: either we accept total surveillance, or we are criminals worthy of suspicion. If proponents of the “nothing to hide” argument meant what they said, then they would logically not object to our filming them having sex with their spouses, publishing their tax returns online, and projecting video of their toilet use on the Jumbotron of a crowded stadium, right? After all, they have nothing to hide. The fact is that each of us has private special moments in our lives, made exceptional by limiting with whom we share such intimacies.
For those who believe the fallacy of nothing to hide, perhaps a lesson in something to fear might be appropriate, for all of us have details in our lives we would rather not share. For example, Google Voice, Skype, your mobile phone carrier, and any number of government agencies have records of anyone who has ever phoned an abortion clinic, a suicide hotline, or a local chapter of Alcoholics Anonymous. Data aggregators know who has searched for “slutty cheerleaders,” “Viagra,” or “Prozac” across any of their electronic devices. While all these behaviors may be perfectly legal, no doubt they have repercussions in our society should the information come to light.
Given that Google and Facebook alone have hundreds of petabytes of data on their users stored in perpetuity, perhaps it is more worthwhile to question not what any of us may have to hide today but what we might wish to keep private in the future. Had Facebook existed in 1950, how might history judge today an off-color joke posted back then? What future crime might you be convicted of without ever knowing you were in fact violating the law? Did you drive across the border to New Jersey or Delaware to save on taxes when buying back-to-school clothes for the kids? Your cell-phone records and credit card receipts document your tax evasion. That photograph on Twitpic of the family dinner showing your twenty-year-old son drinking a glass of wine is evidence of alcohol furnished to a minor. As the computer security researcher Moxie Marlinspike pointed out, there are “27,000 pages of federal statutes” in the United States and another “10,000 administrative regulations. You probably do have something to hide, you just don’t know it yet.”
Privacy Risks and Other Unpleasant Surprises
As Wired’s Mat Honan and the grieving father Mike Seay discovered, our personal data can end up in the hands of those who we assuredly would prefer did not have access to such information. The combination of our social data commingled with public databases, cookies, beacons, and locations can lead to a series of unintended and even harmful consequences. Put another way, your data are increasingly promiscuous. They flow from one system to the next, from database to database, obscured and distributed across cloud-based networks around the world, shared, processed, and sold. But as we have learned from the real world, promiscuity can often lead to social diseases and other unintended consequences.
In an incident not too dissimilar from the OfficeMax debacle, a Minneapolis man learned his daughter was pregnant, not from her, but from his local Target store. The discovery was made when Target began sending the fifteen-year-old girl coupons for items that did not meet her father’s approval. Armed with the coupons and a letter addressed to his daughter, the father furiously marched into Target and began berating the store manager. “My daughter got this in the mail!… She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?” A few days later, the man phoned the store to apologize, noting, “There have been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology.” But how in the world did Target know the girl was pregnant? Through its pregnancy prediction algorithm of course, which aggregated a customer’s entire purchase history with demographic statistics purchased from data brokers. Target reasoned that if it could find those women before their second trimester of pregnancy and hook them as customers, it would receive the lion’s share of their purchases, not just for baby wipes, cribs, and diapers, but for toys and clothes as the infants aged through adolescence. After an in-depth study by Target’s statisticians, Target noticed that women in the baby registry were “buying larger quantities of unscented lotion at the beginning of their second trimester in addition to vitamin supplements such as calcium, magnesium and zinc.” In total, Target was able to identify twenty-five products that, when analyzed together, allowed it to assign each shopper a “pregnancy prediction score.” When this model ran against the millions of women in Target’s customer databases, thousands and thousands of pregnant women were identified before any other companies had made the connection. Target and the company’s marketers were ecstatic with this discovery. Less enthralled was the father of that fifteen-year-old girl in Minneapolis who would learn of his forthcoming grandchild via a corporate coupon mailer. Given Target’s 2013 hacking, in which the financial data of 110 million of its customers were compromised, what guarantees do consumers have that the vast additional troves of highly personal data in Target’s vaults also won’t be stolen? Can customers trust Target or any other large retailer with the volumes of data it collects, stores, and analyzes? Likely not, and therein lies the problem.
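To see how a handful of ordinary purchases can add up to a “pregnancy prediction score,” consider the minimal sketch below. It is an illustration under loud assumptions: Target’s real model, its twenty-five products, and its coefficients are not public, so the weights, the baseline `BIAS`, and the choice of a logistic formula are all hypothetical. Only the four product names come from the passage above.

```python
import math

# Hypothetical weights for the products named in the text; the retailer's
# actual product list and coefficients are not public.
SIGNAL_WEIGHTS = {
    "unscented_lotion": 1.2,
    "calcium": 0.8,
    "magnesium": 0.8,
    "zinc": 0.8,
}
BIAS = -3.0  # assumed baseline: most shoppers are not expecting

def prediction_score(purchases):
    """Return a score between 0 and 1 from the set of products a shopper bought."""
    # Add up the weights of the signal products present in the basket.
    z = BIAS + sum(w for item, w in SIGNAL_WEIGHTS.items() if item in purchases)
    # Squash the total into the 0-1 range with a logistic function.
    return 1.0 / (1.0 + math.exp(-z))

casual = prediction_score({"zinc", "shampoo"})
flagged = prediction_score({"unscented_lotion", "calcium", "magnesium", "zinc"})
```

No single item moves the score much. It is the combination, analyzed together, that pushes a shopper past whatever threshold a marketer chooses.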
The risks to our personal data come not just from hackers but, as more and more people are finding out, from big-data analytics as well. Previously, many of the data aggregated were held in limbo, as our collection abilities well surpassed our ability to make sense of all that had been collected. That is now changing, and the data we leak on social media sites like Facebook are showing up in unexpected ways. One such person affected was Bobbi Duncan, a twenty-two-year-old lesbian student at the University of Texas at Austin. She came from a strict Christian family and worked hard to keep her sexual orientation from her parents. As she began to understand herself better, she joined a number of student groups on campus, including the Queer Chorus, as a means of meeting other gay and lesbian students at her school. When she joined the organization, the president of the Queer Chorus welcomed Bobbi by adding her to the group’s Facebook discussion page, which he was able to do without her permission (there is no setting in Facebook to prevent a third party from adding you to his or her group). When he did so, Facebook sent an automatic system notification to Bobbi’s entire list of friends—including her father—notifying them that she had joined the Queer Chorus. Two days after receiving the notification, Bobbi’s father wrote a reply on his Facebook page: “To all you queers. Go back to your holes and wait for GOD. Hell awaits you perverts. Good luck singing there.” Facebook outed a closeted lesbian and caused her parents to disown her. In response to the irreparable harm she suffered, Bobbi was unequivocal in her stance: “I blame Facebook … It shouldn’t be somebody else’s choice what people see of me.”
When you are the product of Internet and social media companies, the challenge you face is that data you provide in one context can be used in unexpected ways in another, with notable consequences. Such is the case with the highly popular “free” dating site OkCupid. Users seeking dates are asked to fill out questionnaires on the site, and most presume, wrongly, that the data they provide remain exclusively within the OkCupid system, used solely for the purposes of finding a suitable match for a date. Yeah, right! To allegedly get the best matches, OkCupid asks users a bevy of deeply personal questions about their number of past sexual partners, whether they support abortion rights, whether they own a firearm, if they would sleep with somebody on a first date, if they smoke cigarettes, and if they drink alcohol frequently or use illegal drugs (including which drugs and how often). At least that’s what users see on their screens when completing their profiles …
What they don’t see is the fifty or so companies with whom OkCupid shares this information, including ad firms, data brokers, and marketers. To understand the extent of the data leakage, Ashkan Soltani, a digital privacy specialist who used to work at the Federal Trade Commission, created a dummy account on OkCupid. Using several free privacy browser plug-in tools, including Collusion and mitmproxy, Soltani was able to observe that the answers provided by OkCupid’s users were parsed and forwarded to dozens of data brokers in real time. When Soltani completed his test OkCupid profile and clicked that he frequently used drugs, he was able to observe a cookie file that shared his purported drug usage with a data broker known as Lotame. You think you’re just filling out a confidential profile for a “free” online dating service; in reality, you’ve been had and are instead detailing information that you would never otherwise share with any marketing company or data broker. It’s a huge ruse: dating is just the “cover story” for massive data extraction. In an ensuing investigation into Soltani’s research by NPR, both OkCupid and Lotame declined to comment on the matter. Such is the state of affairs in the world’s unregulated data broker industry. Who else might be willing to pay for OkCupid’s archive of your drug use and sexual history? An insurance company, prospective employers, or perhaps the government after that DUI incident you had last June?
Even when you have “nothing to hide,” your continually tracked social network graph and location can come back to bite you and even affect your financial status. A handful of tech start-ups have begun to use the quality of the friends in your social network to determine whether or not you are a good credit risk. One such company, Lenddo, determines if you are friends with somebody who is behind on paying back her loans and how often you interact with that person. As a result, your creditworthiness can drop because of whom you’ve friended on Facebook. If your pals on Google+ and Pinterest are deadbeats, chances are you may be too (according to the big-data gods). Facebook may become the next FICO credit-scoring agency as financial data aggregators take full advantage of your social data feeds to rate your financial stability. So as your mom used to warn you, choose your friends wisely.
The fact is that we are all contributing to our own digital pollution. Just as in the twentieth century people thought nothing of pouring industrial waste into a river or tossing garbage onto the street, so too do we fail to comprehend the long-term consequences of our digital actions today. The current state of affairs stems from our fundamental misunderstanding of the bargain we have made for so-called free online services.
Opening Pandora’s Virtual Box
People share their most intimate thoughts and secrets online as if they were having a private conversation with a trusted friend. If only the legal system agreed. In the United States, social networks are considered to be public spaces, not private ones, and any information shared there is covered under the so-called third-party doctrine, which in plain English means that users have no reasonable expectation of privacy in the data their service providers (cell-phone companies, ISPs, cable companies, and Web sites) collect on them.
This noted exception to the Fourth Amendment’s prohibition on unreasonable search and seizure means that any data you post online in any format (regardless of your privacy settings) or any data that are collected by the third parties with whom you have an agreed-upon business relationship are not considered private. Nor do such data meet the constitutional standard of “private papers”; rather, they form part of the business records of the institution in possession of the data. Shocking though this may be, it is the current state of jurisprudence in the United States, with noted and profound impact on all citizens both online and off. As a result, your data leak to places you would never want them to, and you cannot claw them back, no matter how hard you try.
Accordingly, the word “Facebook” appeared in a full one-third of divorce filings in 2011. All of this provides excellent fodder for the 81 percent of divorce attorneys who admit searching social media sites for evidence that can be used against their clients’ spouses. For instance, all the data shared on Facebook and Twitter and all the cell-phone call records and GPS locational data that neatly recorded whose cell phone was next to whose and when become fair game in the battle royal that can be divorce proceedings. The pictures innocently taken of you at all those parties over the years, blurry-eyed with drink in hand, now become evidence of unfit parenting, a nugget of gold for opposing counsel during cross-examination. That profile you created on OkCupid indicating you were single (which was shared via your browser’s cookies with fifty marketing companies) is perfectly admissible when your wife brings it up during your divorce hearing. When a husband complains that his wife is an inattentive and an unfit mother, he has new powerful evidence to support his claims in the form of subpoenaed records documenting the hundreds of hours she logged on FarmVille and in World of Warcraft, times coinciding with all of her children’s soccer and baseball games that she missed. But the data we’re leaking affect us not only during divorce but in our jobs as well.
A survey conducted by Microsoft on the matter of online reputation found that 70 percent of human resource professionals had rejected a job candidate based on information they had uncovered during an online search. Worse, some employers are now demanding the social media passwords of job applicants and even current employees. Want to work for the Norman, Oklahoma, Police Department, the Maryland Department of Public Safety and Correctional Services, the city of Bozeman, Montana, or the Virginia State Police? Applicants in all of these jurisdictions were required to turn over their Facebook and other social media passwords as part of their so-called routine background checks. This includes providing prospective employers access to all your messages, photographs, and timelines, private and public, on Facebook, Google, Yahoo!, YouTube, and Instagram. While some states, including California, have barred employers from such practices, there is no federal law banning them, and they remain legal in 80 percent of American states, and so the data leak.