The Numerati


by Stephen Baker


  75 As Robert O’Harrow Jr. writes. See No Place to Hide, by Robert O’Harrow Jr., Free Press, 2005.

  Mike Henry, Clinton’s deputy campaign manager, left the race on February 13, 2008, following Clinton’s losses to Senator Barack Obama in Virginia, Maryland, and the District of Columbia.

  4. BLOGGER

  106 Companies and governments alike are poring over. This is happening in countless ways. Consider Michael Cavaretta. He runs a math shop at Ford. He and his team are attempting to mine the company’s vast collection of warranty claims. The big challenge is to reduce millions of documents, some of them handwritten, into math. But first the machines must figure out the writers. What do thousands of mechanics and customer service reps around the world mean when they write phrases like “squeak and squeal,” “shimmy and shake”? Are those pairs of words synonyms? Should they go into the same bucket? Do the meanings of these words vary by region? Cavaretta told me that one mechanic wrote that a car was “squealing like the pig Bubba stuck.” How does a computer make sense of that? Cavaretta’s team extracts all the knowledge it can from this vast collection before clustering the data and using statistical analysis to find patterns of problems in the cars.

  119 A blog about deodorants in Iraq. Stephen Baker, Blogspotting.net, “Captive Advertising Audience at 30,000 Feet,” http://www.businessweek.com/the_thread/blogspotting/archives/2007/01/captive_adverti.html.

  5. TERRORIST

  124 USA Today reported. “NSA Has Massive Database of Americans’ Phone Calls,” USA Today, May 11, 2006.

  125 There’s a lack of historical record. This is a problem for NASA as well. David Danks, a philosophy professor at Carnegie Mellon University, told me that NASA processes data from 40,000 different sensors on the space shuttles, much of it coming in numerous times per second. This provides sufficient data to create detailed simulations of launches. And yet during the first quarter-century of shuttle flights, there have been only two disasters. “We have a sample size of two,” he said. This makes it difficult to pick out patterns of data that point to problems.

  126 Unexpected earth-shaking events. Nassim Nicholas Taleb, The Black Swan: The Impact of the Highly Improbable, Random House, 2007.

  Jerry Friedman, a statistics professor at Stanford. See The Mathematical Sciences’ Role in Homeland Security: Proceedings of a Workshop, National Academies Press, 2004.

  131 Jeff Jonas, like many others. Jonas writes at length about security and privacy challenges surrounding data on his blog, http://www.jeffjonas.typepad.com/.

  143 As many as 300 cameras. “Watching You Watching Me,” New Statesman, Oct. 2, 2006.

  The Chinese government announced plans. “China Enacting a High-Tech Plan to Track People,” New York Times, Aug. 12, 2007.

  6. PATIENT

  160 “There are a zillion people following biology” For that same reason, I decided not to focus the medicine chapter on what the Numerati are doing in the vast field of genetics. But I did research the subject. One of my ideas was to figure out the genetic odds that, like my father, I would develop glaucoma and macular degeneration and eventually go blind late in my life. This question led me to the University of Iowa, where a personable doctor named Edwin Stone has built a world-class eye research operation, including the Carver Family Center for Macular Degeneration. I learned there about an experiment to decode the entire genome of a rat’s eye, which is similar—despite its beady appearance—to our own. The job for the Numerati studying the rat gene is not to find single genes that create blindness. Those are rare. Instead, the challenge is to untangle tens of millions of relationships among the genes and to map the paths of power and influence within the eye. The secrets to blindness are not found in the structure of the genome but instead in the behavior of its components. It’s like a society.

  The analysis, of course, is statistical. And as I learned about it, I began to see that it’s very much like the work that goes on at Tacoda. Just as Dave Morgan was searching for the behavioral patterns of romantic-movie lovers, genetic researchers have to parse the behavior of the influential genes. What activates them? Are there stimuli coming from other genes or proteins? Which ones? In both domains, advertising and genetics, the process involves sifting through massive sets of data, looking for patterns, weighing statistics, and using probability to distinguish between a cause and a coincidence. From the point of view of the Numerati, the microscopic forces within our bodies behave more like communities, or even markets, than like components of a machine.

  I’m sorry to report that I learned nothing about the chances that I would go blind, much less that genetic fixes were at hand. Instead, Dr. Stone prepared me for a gradual approach to battling inherited diseases: “A couple of years ago,” he told me, “we identified this gene called the fibulin 5. It’s responsible for 1.5 percent of age-related macular degeneration.” He made a tiny space between two fingers. “It’s this dinky little thing, right?” But the discovery, he said, gives researchers a look at the mechanism that causes macular degeneration. “This allows us then to do experiments that say, now why is that? Why is a tiny change in this gene causing people to get these accumulations under their retinas? . . . If we could understand that pathway,” he said, “maybe there are things we could do when somebody is 35 years old to knock that pathway down a little bit. Then instead of the average age of someone losing vision from macular degeneration being 67 or 71 or something, maybe it could be 87 or 91. We’d like it to be never. But from a population point of view, every three or four years that you could move that curve make a dramatic difference in the amount of blindness out there.”

  171 Provided you fork over your data. Hospitals that figure out how to make intelligent use of patient data are bound to rise to the top. This, as I learned on a visit to the Mayo Clinic in Rochester, Minnesota, has long been the case. I met with the clinic’s data expert, Dr. Christopher Chute, who told me about a crucial breakthrough. In the first years of the clinic, more than a century ago, he told me, the Mayo brothers ran their operation much like other big clinics. Say a patient came in with a sore shoulder. He was sent to the orthopedic specialist. But it turned out to be a heart problem! So off he went to the coronary specialist. He took some medicine there and broke out in hives. Next stop, dermatologist. Each of these three doctors had a separate record of the patient. Often they had to track down their colleagues to piece together the twists and turns of their patients’ cases.

  Enter the Mayo brothers’ partner, Henry Plummer. In 1907, he and his assistant, Mabel Root, devised a new system. Upon signing in, each patient received a dossier, to be carried from doctor to doctor. This way, each doctor could study the medical history of their patients from the first day they arrived at the clinic. When the patients checked out, their dossiers went into a big file. Plummer and Root put color codings on the dossiers for each type of disease and treatment. The result, said Chute, using language that sounds more Google than Mayo, “They had a paper database that was structured and searchable!” Through the years, they indexed the dossiers with ever finer detail. This enabled them to engage in what our generation would call analytics. They could look at every case of colon cancer or tonsillitis, and analyze which treatments were most effective and cost-efficient. “This was continuous quality improvement,” Chute said, referring to the industrial process Japanese automakers made famous decades later. They turned the practice of medicine from a boutique business of independent consultants into a modern business. “This place exploded out of the corn fields.” The challenge now, of course, is to come up with a similar breakthrough for medical data in the twenty-first century.

  172 In Britain, Norwich Union offers. “Norwich Union Buys Tracking Equipment for Pay-as-You-Go Motor Insurance,” Insurance Business Review, Oct. 6, 2005.

  7. LOVER

  195 94 percent of U.S. corporations. “The Art of the Online Résumé,” BusinessWeek, May 7, 2007.

  196 Software to record their movements and interactions. “Gadgets That Know Your Next Move,” Technology Review, Nov. 1, 2006.

  CONCLUSION

  215 “Garbage in, garbage out.” Not everyone agrees with the familiar thesis of garbage in, garbage out. Early in my research, I was talking about it with William Pulleyblank, IBM’s vice president in charge of business optimization and a former director of the company’s Deep Computing Institute. “Garbage in, garbage out isn’t correct anymore,” he said. “You haven’t got time to clean up your data. The real challenge is how you make something of value from ‘garbage.’” In other words, in a fast-moving business world, quick and dirty conclusions have a fighting chance to work. Slow and sure, by contrast, is often an oxymoron, because data may be out of date by the time it’s cleaned and vetted.

  Sources and Further Reading

  * * *

  Ayres, Ian. Super Crunchers: Why Thinking-by-Numbers Is the New Way to Be Smart. Bantam, 2007

  Barabasi, Albert-Laszlo. Linked. Plume/The Penguin Group, 2003

  Bardi, Jason Socrates. The Calculus Wars. Thunder’s Mouth Press, 2006

  Briggs, Rex, and Greg Stuart. What Sticks. Kaplan Publishing, 2006

  Brin, David. The Transparent Society. Basic Books, 1998

  Courant, Richard, and Herbert Robbins (revised by Ian Stewart). What Is Mathematics? Oxford University Press, 1996 (originally published in 1941)

  Dantzig, Tobias. Number: The Language of Science. Fourth edition. The Free Press, 1967

  Gleick, James. Isaac Newton. Vintage Books, 2003

  Hamm, Steve. Bangalore Tiger. McGraw-Hill, 2007

  Henshaw, John M. Does Measurement Measure Up? The Johns Hopkins University Press, 2006

  Morville, Peter. Ambient Findability: What We Find Changes Who We Become. O’Reilly Media, 2005

  O’Harrow, Robert Jr. No Place to Hide. Free Press, 2005

  Schultz, Don E., Stanley I. Tannenbaum, and Robert F. Lauterborn. The New Marketing Paradigm: Integrated Marketing Communications. NTC Business Books, 1994

  Sosnik, Douglas B., Matthew J. Dowd, and Ron Fournier. Applebee’s America. Simon & Schuster, 2006

  Stakutis, Chris, and John Webster. Inescapable Data: Harnessing the Power of Convergence. IBM Press, 2005

  Watts, Duncan J. Six Degrees: The Science of a Connected Age. Norton, 2003

  Whitehead, Alfred North. Introduction to Mathematics. Barnes & Noble Books, 2005 (originally published in 1911)

  Index

  * * *


  A

  Accenture (company), 42–44, 46–49, 58–64, 70

  Acxiom (company), 75–76

  AdSense (Google service), 117

  Advertisers, 202

  calculating rate of return for, 94, 95

  changes in methods of, 9–11, 14, 98, 100–116

  customer lists shared by, 62, 76, 91–92, 207

  and the Internet, 2–7, 15–16, 42, 98–122, 187, 194

  microtargeting by, 91, 205, 224n

  and retail stores, 47–66, 70, 125, 141, 183, 192

  selling our own data to, 205

  See also “Buckets”; Shoppers; “Tribes”

  African American voters, 226n

  Age (generations)

  distinguishing, through word analysis, 100–101, 108, 111

  on online dating questionnaire, 199–200

  Alamo Rent A Car, 1–3, 15–16, 27, 57

  Algorithms

  for analyzing patients, 158, 160, 181

  for analyzing shoppers, 57, 194–95

  for analyzing terrorists, 126–27, 129, 145

  for analyzing voters, 72, 86, 91

  for biological analysis, 56, 160

  dating services’ use of, 182, 184, 192, 194–95, 200, 205

  defined, 31–32, 222–23n

  as Numerati tool, 13, 14, 30–32, 39, 57–58, 205, 206–7

  Alhazmi, Nawaf, 131–32

  Allen, Paul, 159

  AllianceTech (company), 65

  Almihdhar, Khalid, 131–32

  Al Qaeda, 124, 131, 142, 146

  Alzheimer’s disease, 159, 161, 176, 177, 180

  Amazon.com, 14, 42, 45, 61, 162

  Andresen, Dan, 169–73, 198

  ANNA (privacy software), 152

  Anthropologists, 149–50, 154–55, 182, 184, 188

  AOL (company), 15–16

  Applebee’s America (Dowd), 68, 92, 225n

  Arnold, Douglas, 64

  Artificial intelligence, 107–8

  See also Computers; Machine learning

  ASWORG, 30–31

  AttentionTrust (company), 205

  B

  Baker, Mary Jane and Walter (author’s parents), 72–74, 85, 154, 156, 174–76

  Baker, Stephen, 182–86, 198–200

  “Barnacle” shoppers, 51–53, 64

  “Barn Raisers” tribe, 78, 80, 81, 83, 88, 93, 126

  Bartering, 26

  Baseball Prospectus (website), 27

  Baseball statistics, 27–28

  BBN (company), 144

  Bed sensors, 158, 161, 177

  Behavior

  altering, 13–14, 49–51

  predicting, 3, 12–14, 44, 86–90, 116, 150, 159, 166, 173, 188, 196–98, 201–2, 207

  proxies for, 70, 83–85, 90

  tracking of cell phone users’, 195–97

  tracking of elderly people’s, 154–81

  tracking of Internet users’, 1–6, 17–19, 187, 188

  tracking of terrorists’, 125–26

  tracking shoppers’ patterns of, 41–66, 187

  See also Data; Mathematical models

  “Behavioral markers,” 160

  Beltran, Carlos, 27–28, 40

  Bin Laden, Osama, 124, 149

  Biology and biologists, 7, 11, 14, 58, 142, 160, 167–68, 188–89, 201, 227n

  See also DNA; Genetics

  Black, Fischer, 21

  The Black Swan (Taleb), 126

  Bloggers, 96–122

  Bluetooth data connections, 103–4

  “Bootstrapper” tribe, 88–89

  “Bootstrapping,” 164

  Brands, 10, 47, 50

  Brin, Sergey, 215

  Britain, 143

  “Buckets,” 50–55, 57, 59, 80–81, 105, 115, 128, 187, 205, 207

  See also “Tribes”

  “Builders” (personality type), 189–91

  Bush, George W., 68, 91–92, 95, 114–15, 124, 131

  BusinessWeek (magazine), 195, 221n

  “Butterfly” shoppers, 53, 64

  BuzzMetrics (company), 104, 121

  C

  Cameras (surveillance)

  at Accenture, 44, 63–64

  in casinos, 137–41

  in homes of the elderly, 162–63, 166

  in public places, 4, 43, 63–64, 143–44

  See also Facial recognition; Photos; Surveillance

  Capital IQ, 210, 211

  Capital One, 224n

  Carbonell, Jaime, 61

  Carbon nanotube, 168

  Carley, Kathleen, 35–37

  Carnegie Mellon University (CMU), 13, 35–37, 45–46, 146–48, 211

  Casablanca (movie), 141

  Casinos, 133–41, 144

  Cavaretta, Michael, 226n

  Cell phones

  Bluetooth technology for, 103–4

  data produced by, 4, 5, 16, 195–99

  technical issues associated with, 174

  tracking use of, 35, 130, 195–99

  Central Intelligence Agency. See CIA

  Chávez, Hugo, 211

  Chemistry.com, 182–93, 198, 199–200, 205

  China, 33, 38, 143–45

  ChoicePoint (company), 75

  Chute, Christopher, 228–29n

  CIA (Central Intelligence Agency), 124, 129, 135

  “Civic Sentries” tribe, 82, 87, 93

  Civil liberties. See Privacy

  Clairvoyance Corp., 152

  Clustering software, 61–62

  CMU. See Carnegie Mellon University

  Code-breaking, 128–30

  Cold War, 129–30

  Community, 70, 71, 74, 77, 79–80

  Computers

  and algorithms, 222–23n

  on animals, 169–71, 174

  brains compared to, 25

  chips in, 4–5

  cookies on, 2

  cost of, 157

  data produced using, 4–5

  history of uses of, 7–11

  speed of calculations by, 86–87, 112

  teaching, to recognize “tribes,” 59–62

  weaknesses of, 112–13

  and workers, 17–40, 63–64, 97, 106

  See also Algorithms; Computer scientists; Data; Internet; Machine learning; Mathematical models; Mathematicians; Privacy; RFID

  Computer scientists

  competition over hiring of, 145–46

  as making sense of data, 6, 9, 35–37, 129

  and math, 221n

  myths about, 206–14

  See also Computers; Numerati

 
