
Data and Goliath


by Bruce Schneier


  Companies quickly realized that they could set their own cookies on pages belonging to other sites—with their permission and by paying for the privilege—and the third-party cookie was born. Enterprises like DoubleClick (purchased by Google in 2007) started tracking web users across many different sites. This is when ads started following you around the web. Research a particular car or vacation destination or medical condition, and for weeks you’ll see ads for that car or city or a related pharmaceutical on every commercial Internet site you visit.
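  The mechanism described above can be illustrated with a minimal sketch. It is a toy model, not how DoubleClick or any real ad network is implemented: the class name, the `tracker_id` cookie name, and the example site names are all hypothetical. It shows only the core idea, that one company whose content is embedded on many sites receives the same cookie back on each of them and can therefore link the visits.

```python
import uuid
from collections import defaultdict


class ThirdPartyTracker:
    """Hypothetical ad network whose content is embedded on many sites."""

    def __init__(self):
        # Maps each cookie ID to the list of sites where it was seen.
        self.profiles = defaultdict(list)

    def serve_ad(self, browser_cookies, site):
        # The browser returns the cookie this tracker set earlier, if any.
        cookie_id = browser_cookies.get("tracker_id")
        if cookie_id is None:
            # First contact: assign a random identifier and store it.
            cookie_id = uuid.uuid4().hex
            browser_cookies["tracker_id"] = cookie_id
        # Record which site embedded the ad for this identifier.
        self.profiles[cookie_id].append(site)
        return cookie_id


tracker = ThirdPartyTracker()
cookie_jar = {}  # cookies one browser holds for the tracker's domain

for site in ["cars.example", "travel.example", "health.example"]:
    tracker.serve_ad(cookie_jar, site)

# The tracker now holds one browsing history keyed by a single cookie ID.
history = tracker.profiles[cookie_jar["tracker_id"]]
print(history)
```

  The identifier itself is random and says nothing about the person; the tracking comes from seeing the same identifier on every site that embeds the tracker.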

  This has evolved into a shockingly extensive, robust, and profitable surveillance architecture. You are being tracked pretty much everywhere you go on the Internet, by many companies and data brokers: ten different companies on one site, a dozen on another. Facebook tracks you on every site with a Facebook Like button (whether you’re logged in to Facebook or not), and Google tracks you on every site that has a Google Plus +1 button or that simply uses Google Analytics to monitor its own web traffic.

  Most of the companies tracking you have names you’ve never heard of: Rubicon Project, AdSonar, Quantcast, Pulse 260, Undertone, Traffic Marketplace. If you want to see who’s tracking you, install one of the browser plugins that let you monitor cookies. I guarantee you will be startled. One reporter discovered that 105 different companies tracked his Internet use during one 36-hour period. In 2010, a seemingly innocuous site like Dictionary.com installed over 200 tracking cookies on your browser when you visited.

  It’s no different on your smartphone. The apps there track you as well. They track your location, and sometimes download your address book, calendar, bookmarks, and search history. In 2013, the rapper Jay-Z and Samsung teamed up to offer people who downloaded an app the ability to hear the new Jay-Z album before release. The app required the ability to view all accounts on the phone, track the phone’s location, and track who the user was talking to on the phone. And the Angry Birds game even collects location data when you’re not playing.

  Broadband companies like Comcast also conduct surveillance on their users. These days they’re mostly monitoring to see whether you illegally download copyrighted songs and videos, but other applications aren’t far behind. Verizon, Microsoft, and others are working on a set-top box that can monitor what’s going on in the room, and serve ads based on that information.

  It’s less Big Brother, and more hundreds of tattletale little brothers.

  Today, Internet surveillance is far more insistent than cookies. In fact, there’s a minor arms race going on. Your browser—yes, even Google Chrome—has extensive controls to block or delete cookies, and many people enable those features. DoNotTrackMe is one of the most popular browser plug-ins. The Internet surveillance industry has responded with “flash cookies”—basically, cookie-like files that are stored with Adobe’s Flash player and remain when browsers delete their cookies. To block those, you can install FlashBlock. But there are other ways to uniquely track you, with esoteric names like evercookies, canvas fingerprinting, and cookie synching. It’s not just marketers; in 2014, researchers found that the White House website used evercookies, in violation of its own privacy policy. I’ll give some advice about blocking web surveillance in Chapter 15.

  Cookies are inherently anonymous, but companies are increasingly able to correlate them with other information that positively identifies us. You identify yourself willingly to lots of Internet services. Often you do this with only a username, but increasingly usernames can be tied to your real name. Google tried to compel this with its “real name policy,” which mandated that users register for Google Plus with their legal names, until it rescinded that policy in 2014. Facebook pretty much demands real names. Anytime you use your credit card number to buy something, your real identity is tied to any cookies set by companies involved in that transaction. And any browsing you do on your smartphone is tied to you as the phone’s owner, although the website might not know it.

  FREE AND CONVENIENT

  Surveillance is the business model of the Internet for two primary reasons: people like free, and people like convenient. The truth is, though, that people aren’t given much of a choice. It’s either surveillance or nothing, and the surveillance is conveniently invisible so you don’t have to think about it. And it’s all possible because US law has failed to keep up with changes in business practices.

  Before 1993, the Internet was entirely noncommercial, and free became the online norm. When commercial services first hit the Internet, there was a lot of talk about how to charge for them. It quickly became clear that, except for a few isolated circumstances like investment and porn websites, people weren’t willing to pay even a small amount for access. Much like the business model for television, advertising was the only revenue model that made sense, and surveillance has made that advertising more profitable. Websites can charge higher prices for personally targeted advertising than they can for broadcast advertising. This is how we ended up with nominally free systems that collect and sell our data in exchange for services, then blast us with advertising.

  “Free” is a special price, and there has been all sorts of psychological research showing that people don’t act rationally around it. We overestimate the value of free. We consume more of something than we should when it’s free. We pressure others to consume it. Free warps our normal sense of cost vs. benefit, and people end up trading their personal data for less than its worth.

  This tendency to undervalue privacy is exacerbated by companies deliberately making sure that privacy is not salient to users. When you log on to Facebook, you don’t think about how much personal information you’re revealing to the company; you chat with your friends. When you wake up in the morning, you don’t think about how you’re going to allow a bunch of companies to track you throughout the day; you just put your cell phone in your pocket.

  The result is that Internet companies can improve their product offerings to their actual customers by reducing user privacy. Facebook has done it systematically over the years, regularly updating its privacy policy to obtain more access to your data and give you less privacy. Facebook has also changed its default settings so that more people can see your name, photo, wall posts, photos you post, Likes, and so on. Google has done much the same. In 2012, it announced a major change: Google would link its data about you from search, Gmail, YouTube (which Google owns), Google Plus, and so on into one large data set about you.

  Apple is somewhat of an exception here. The company exists to market consumer products, and although it could spy on iCloud users’ e-mail, text messages, calendar, address book, and photos, it does not. It uses iTunes purchase information only to suggest other songs and videos a user might want to buy. In late 2014, it started using this as a market differentiator.

  Convenience is the other reason we willingly give highly personal data to corporate interests, and put up with becoming objects of their surveillance. As I keep saying, surveillance-based services are useful and valuable. We like it when we can access our address book, calendar, photographs, documents, and everything else on any device we happen to be near. We like services like Siri and Google Now, which work best when they know tons about you. Social networking apps make it easier to hang out with our friends. Cell phone apps like Google Maps, Yelp, Weather, and Uber work better and faster when they know our location. Letting apps like Pocket or Instapaper know what we’re reading feels like a small price to pay for getting everything we want to read in one convenient place. We even like it when ads are targeted to exactly what we’re interested in. The benefits of surveillance in these and other applications are real, and significant.

  We especially don’t mind if a company collects our data and uses it within its own service to better serve us. This is why Amazon recommendations are rarely mentioned when people complain about corporate surveillance. Amazon constantly recommends things for you to buy based on the things you’ve bought and the things other people have bought. Amazon’s using your data in the same context it was collected, and it’s completely transparent to the user. It’s very big business for Amazon, and people largely accept it. They start objecting, though, when their data is bought, sold, and used without their knowledge or consent.

  THE DATA BROKER INDUSTRY

  Customer surveillance is much older than the Internet. Before the Internet, there were four basic surveillance streams. The first flowed from companies keeping records on their customers. This was a manufacturing supply company knowing what its corporate customers order, and who does the ordering. This was Nordstrom remembering its customers’ sizes and the sorts of tailoring they like, and airlines and hotels keeping track of their frequent customers. Eventually this evolved into the databases that enable companies to track their sales leads all the way from initial inquiry to final purchase, and retail loyalty cards, which offer consumers discounts but whose real purpose is to track their purchases. Now lots of companies offer Customer Relationship Management, or CRM, systems to corporations of all sizes.

  The second traditional surveillance stream was direct marketing. Paper mail was the medium, and the goal was to provide companies with lists of people who wanted to receive the marketing mail and not waste postage on people who did not. This was necessarily coarse, based on things like demographics, magazine subscriptions, or customer lists from related enterprises.

  The third stream came from credit bureaus. These companies collected detailed credit information about people, and sold that information to banks trying to determine whether to give individuals loans and at what rates. This has always been a relatively expensive form of personal data collection, and only makes sense when lots of money is at stake: issuing credit cards, allowing someone to lease an apartment, and so on.

  The fourth stream was from government. It consisted of various public records: birth and death certificates, driver’s license records, voter registration records, various permits and licenses, and so on. Companies have increasingly been able to download, or purchase, this public data.

  Credit bureaus and direct marketing companies combined these four streams to become modern day data brokers like Acxiom. These companies buy your personal data from companies you do business with, combine it with other information about you, and sell it to companies that want to know more about you. And they’ve ridden the tides of computerization. The more data you produce, the more they collect and the more accurately they profile you.

  The breadth and depth of information that data brokers have is astonishing. They collect demographic information: names, addresses, telephone numbers, e-mail addresses, gender, age, marital status, presence and ages of children in household, education level, profession, income level, political affiliation, cars driven, and information about homes and other property. They collect lists of things you’ve purchased, when you’ve purchased them, and how you paid for them. They keep track of deaths, divorces, and diseases in your family. They collect everything about what you do on the Internet.

  Data brokers use your data to sort you into various marketable categories. Want lists of people who fall into the category of “potential inheritor” or “adult with senior parent,” or addresses of households with a “diabetic focus” or “senior needs”? Acxiom can provide you with that. InfoUSA has sold lists of “suffering seniors” and gullible seniors. In 2011, the data broker Teletrack sold lists of people who had applied for nontraditional credit products like payday loans to companies who wanted to target them for bad financial deals. In 2012, the broker Equifax sold lists of people who were late on their mortgage payments to a discount loan company. Because this was financial information, both brokers were fined by the FTC for their actions. Almost everything else is fair game.

  PERSONALIZED ADVERTISING

  We use systems that spy on us in exchange for services. It’s just the way the Internet works these days. If something is free, you’re not the customer; you’re the product. Or, as Al Gore said, “We have a stalker economy.”

  Advertising has always suffered from the problem that most people who see an advertisement don’t care about the product. A beer ad is wasted on someone who doesn’t drink beer. A car advertisement is largely wasted unless you are in the market for a car. But because it was impossible to target ads individually, companies did the best they could with the data they had. They segmented people geographically, and guessed which magazines and TV shows would best attract their potential customers. They tracked populations as a whole, or in large demographic groups. It was very inefficient. There’s a famous quote, most reliably traced to the retail magnate John Wanamaker: “I know half of my advertising is wasted. The trouble is, I don’t know which half.”

  Ubiquitous surveillance has the potential to change that. If you know exactly who wants to buy a lawn mower or is worried about erectile dysfunction, you can target your advertising to the right person at the right time, eliminating waste. (In fact, a national lawn care company uses aerial photography to better market its services.) And if you know the details about that potential customer—what sorts of arguments would be most persuasive, what sorts of images he finds most appealing—your advertising can be even more effective.

  This also works in political advertising, and is already changing the way political campaigns are waged. Obama used big data and individual marketing to great effect in both 2008 and 2012, and other candidates across parties are following suit. This data is used to target fund-raising efforts and individualized political messages, and ensure that you actually get to the polls on Election Day—assuming the database says that you’re voting for the correct candidate.

  A lot of commercial surveillance data is filled with errors, but this information can be valuable even if it isn’t very accurate. Even if you ended up showing your ad to the wrong people a third of the time, you could still have an effective advertising campaign. What’s important is not perfect targeting accuracy; it’s that the data is enormously better than before.

  For example, in 2013, researchers were able to determine the physical locations of people on Twitter by analyzing similarities with other Twitter users. Their accuracy rate wasn’t perfect—they were only able to predict a user’s city with 58% accuracy—but for plenty of commercial advertising that level of precision is good enough.

  Still, a lot of evidence suggests that surveillance-based advertising is oversold. There is value in showing people ads for things they want, especially at the exact moment they are considering making a purchase. This is what Google tries to do with Adwords, its service that places ads next to search results. It’s what all retailers try to do with “people who bought this also bought this” advertising. But these sorts of things are based on minimal surveillance.

  What’s unclear is how much more data helps. There is value in knowing broad personal details about people: they’re gay, they’re getting married, they’re thinking about taking a tropical vacation, they have a certain income level. And while it might be very valuable for a car company to know that you’re interested in an SUV and not a convertible, it’s only marginally more valuable to know that you prefer the blue one to the green one. And it’s less valuable to know that you have two kids, one of whom still needs a car seat. Or that one of the kids died in a car crash. Yes, a dealer would push the larger SUV in the first instance and tout safety in the second, but there are diminishing returns. And advertising that’s too targeted feels creepy and risks turning customers off.

  There’s a concept from robotics that’s useful here. We tend to be comfortable with robots that obviously look like robots, and with robots that appear fully human. But we’re very uncomfortable with robots that look a lot like people but don’t appear fully human. Japanese roboticist Masahiro Mori called this phenomenon the “uncanny valley.” Technology critic Sara M. Watson suggested that there’s a similar phenomenon in advertising. People are okay with sloppily personalized advertising and with seamless and subtle personalized advertising, but are creeped out when they see enough to realize they’re being manipulated or when it doesn’t match their sense of themselves.

  This is all going to change over time, as we become used to personalized advertising. The definition of “creepy” is relative and fluid, and depends a lot on our familiarity with the technologies in question. Right now, ads that follow us around the Internet feel creepy. Creepy is also self-correcting. Google has a long and complex policy of impermissible search ads, because users found some types of advertising too creepy. Other companies are letting people click on a link to find out why they were targeted for a particular ad, hoping that that will make them more comfortable with the process.

  On the other hand, some companies just hide it better. After the story ran about Target figuring out that the teenager was pregnant, the company changed the way it sent targeted advertising to people. It didn’t stop advertising to women it thought were pregnant, but it embedded those targeted ads among more general ads. Recipients of these mailings didn’t feel targeted, so they were less creeped out by the advertisements.

  Meanwhile, the prevalence of advertising in our environment is making individual ads less valuable for two reasons. First, as advertising saturates our world, the value of each individual ad falls. This is because the total amount of money we have to spend doesn’t change. For example, all automobile manufacturers are fighting for the profit from the one car you will buy. If you see ten times the ads, each one is only worth one tenth the price, because in the end you’re only going to buy one car.
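  The arithmetic behind the car example can be written out as a short sketch. The profit figure and the function name are illustrative assumptions, not numbers from the text; the only thing taken from the argument above is that the total to be won stays fixed while the number of ads grows.

```python
def value_per_ad(total_profit, ads_seen):
    """Split a fixed profit evenly across every ad competing for it."""
    return total_profit / ads_seen


# Assumed, purely illustrative profit on the one car you will buy.
profit_on_one_car = 1000.0

before = value_per_ad(profit_on_one_car, 10)    # 10 ads seen
after = value_per_ad(profit_on_one_car, 100)    # ten times as many ads

print(before, after, before / after)
```

  Under this even-split assumption, multiplying the number of ads by ten divides the value of each ad by ten, whatever the starting profit figure is.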

 
