
The Numerati


by Stephen Baker


  And the rest of us? We should grasp the basics of math and statistics—certainly better than most of us do today—but still follow what we love. The world doesn’t need millions of mediocre mathematicians, and there’s plenty of opportunity for specialists in other fields. Even in the heart of the math economy, at IBM Research, geometers and engineers work on teams with linguists and anthropologists and cognitive psychologists. They detail the behavior of humans to those who are trying to build mathematical models of it. All of these ventures, from Samer Takriti’s gang at IBM to the secretive researchers laboring behind the barricades at the National Security Agency, feed from the knowledge and smarts of diverse groups. The key to finding a place on such world-class teams is not necessarily to become a math whiz but to become a whiz at something. And that something should be in an area that sparks the most enthusiasm and creativity within each of us. Somewhere on those teams, of course, whether it’s in advertising, publishing, counterterrorism, or medical research, there will be at least a few Numerati. They’ll be the ones distilling this knowledge into numbers and symbols and feeding them to their powerful tools.

  IT’S A SUNNY morning in Palo Alto. I’m having breakfast with a venture capitalist and then a session with Google in the afternoon. The cell phone rings. It’s my old college roommate, a Ph.D. in computer science, who probably forgets every year or two more math than I ever learned. I’m on my very first days on this odyssey, and I have the excitement of someone who’s just stepped into a new world. I tell him what I’m up to and explain, in a sentence or two, how the mathematicians are going to dip into the sea of data to form models of all of us. “This is the mathematical modeling of humanity,” I say. The connection starts to fill with static, but before it goes dead, I hear him say, “I’m going to call you back!”

  A few minutes later, I’m driving north on Interstate 280, looking for the Sand Hill Road exit, when the phone rings again. “I’m really worried about your story,” he says. I tell him that I’m heading to interviews, too busy with driving to talk. He tells me to pull over. After I do, he explains that once he too dreamed of modeling the world but has since concluded that math, while powerful, is flawed.

  “Why?”

  “Ever hear of garbage in, garbage out?” His point is that mathematicians model misunderstandings of the world, often using the data at hand instead of chasing down the hidden facts. He tells the story of a drunk looking for his keys on a dark night under a streetlight. He’s looking for them under that lamp not necessarily because he dropped them there but because it’s the only place with light.

  Later that afternoon, I’m sitting at an outdoor patio with Craig Silverstein, Google’s chief technologist. He was the number-one employee at Google. The founders, Larry Page and Sergey Brin, hired him because neither one of them, for all their brilliant ideas, knew much about search engines. It’s sunny and the wind is blowing the pages of my notebook, and I tell Silverstein the story about the drunk looking for his keys.

  He smiles. He’s heard it many times before. He recalls a science fair in junior high, where his project featured lots of good data he’d come up with. “I wanted this data to be significant,” he says. “Finally I found a test that matched it.” The judges, he adds, weren’t fooled.

  Spending all this time among the Numerati, I’ve found myself wondering what jobs the rest of the world will handle in an economy dominated by calculations. Now it occurs to me: it’s up to us to help them find the keys. The mathematicians and computer scientists create magic but only if their formulas contain real, meaningful information from the physical world we inhabit. That’s the way it’s always been, and even as they mine truckloads of data, it’s a team effort. “In the end,” Silverstein says, “you’re just counting things.”

  What’s new, of course, is that many of these “things” the Numerati are busy counting are people. They’re adding us up every which way, and they have all of humanity to model. The rise of this counting elite will convulse entire industries. It’s already happening. At the same time, I suspect it will lead many of us to give more thought to who we are. As we encounter mathematical models built to predict our behavior and divine our deepest desires, it’s only human for each of us to ask, “Did they get it right? Is that really me?”

  Acknowledgments

  * * *

  I’d like to thank my colleagues at BusinessWeek, who gave me so much help and support through this process. It was editor in chief Steve Adler who suggested doing a cover story on math. I thank him for giving me the story and later granting me more than a year’s leave of absence to report and write this book. Neil Gross was my thoughtful and patient editor for the magazine story and a sounding board as I worked on the book. Thanks also to Peter Elstrom, Frank Comes, and John Byrne. My agent, Jim Levine, jumped on this book idea with great enthusiasm and helped tremendously with the proposal. Elizabeth Fisher worked tirelessly on the foreign sales.

  I couldn’t have dreamed of a better editor at Houghton Mifflin than Amanda Cook. She helped shape the book conceptually. It started early in the process when she told me that practically the entire book was sitting in Chapter 4 of my proposal. She was right. During the writing stage, she sent a steady stream of marked-up manuscripts from Boston to Montclair while assuring me on the phone that things really were moving ahead. Thanks also to Susanna Brougham for helping to shape the final text. I’m grateful for the wonderful work of Bridget Marmion, Lori Glazer, Patrice Taddonio, Sanj Kharbanda, and Elizabeth Lee in promoting the book.

  I appreciate all the generous help given to me throughout the process by my sources for this book, both those cited in these pages and the many more who are not. A special thanks to Anne Watzman at Carnegie Mellon University, who introduced me to the World Wide Web and coaxed me in my Pittsburgh days to shift my focus from steel to technology. Thanks too to my neighbor and favorite mathematician, Alfredo Bequillard.

  I’d like to direct a salute to my parents, Mary Jane and Walter, who were both thrilled when the project began and were fully involved in spirit to the very last sentence. My sisters, Judy, Sally, and Carol, were loving and supportive throughout. Thanks as well to my sons, Aidan, Jack, and Henry, and to my wife, Jalaire. She endured the computer-dating ordeal, among other things. And she put up with the clicking of the laptop (which drives her crazy to this day) and my reclining form on the blue couch for more months than either of us would like to count.

  Notes

  * * *

  INTRODUCTION

  2 Gorges on data. Data is a plural form of the singular noun datum. But in many fields, data is treated as a singular noun, just as the singular word sand stands for lots of individual bits of silica. That’s how I use data in this book.

  5 In a single month, Yahoo alone. See “To Aim Ads, Web Is Keeping Closer Eye on You,” New York Times, March 10, 2008.

  6 Crack mathematicians. A word about the genesis of this book. I was pitching a cover story at an editorial meeting at BusinessWeek. The story, I said, would focus on the risks to the technology economy of the United States. Many regions of the world had better Internet connections and superior wireless networks, and they were graduating more scientists and engineers. What’s more, the U.S. economy, which had long attracted some of the brainiest foreigners, was blocking more of them from coming to our country post-9/11. And those same foreigners had plenty of lucrative opportunities at home. The editors found this pitch too familiar. Wasn’t there a fresher way to attack this story? Neil Gross, a senior editor, mentioned that mathematics was at the heart of most of the key technologies. That idea turned into a cover story. Math was fresh. Who wrote cover stories about mathematics?

  I began interviewing mathematicians at MIT, the Courant School at New York University, and Bell Labs. And it quickly became clear to me that writing a story about math was akin to writing one about words. The subject was too vast. So I focused on data, and as I did so, the story swerved from pure math to computer science. Math is a big part of what the Numerati do, though, and so we kept it in the article title: “Math Will Rock Your World.” See http://www.businessweek.com/magazine/content/06_04/b3968001.htm.

  7 Later, notes Tobias Dantzig. See Dantzig, Number: The Language of Science, p. 7.

  8 A mathematician named Bill Fair. Researchers at Fair Isaac are hoping to use their data-modeling expertise for applications far beyond finance. One potential market is medicine. People who neglect to take their prescribed medications wind up more often at the emergency room with more serious (and expensive) problems. Forgotten or ignored prescriptions, according to Fair Isaac, cost insurance companies an estimated $15.2 billion annually in the United States. So researchers at the company are developing a system to assign to each of us a score denoting the risk that we won’t take our pills.

  Which details from our life predict that we’ll become medical slackers? It might have to do with age, years of schooling, or whether we live alone. There may be statistical correlations among ethnic groups. Fair Isaac’s researchers are poring over the data now. But if in the future they figure out how to predict this risk, we might eventually carry a numerical score for so-called prescription noncompliance. Those of us with high numbers might get a call every day or two from the doctor’s office, reminding us to take our drugs. Maybe they’ll even send someone to knock on our door. Sure it’s pricy, but from the insurance companies’ perspective, it’s cheaper than three weeks in the intensive care unit.

  Fair Isaac is also considering creating numerical scores for all sorts of human qualities, including honesty, generosity, and reliability. Employers, of course, would be interested in some of these numbers. And if the philanthropy industry had access to our generosity scores, they could target more efficient fundraising drives. So far, these are still just ideas.

  13 Composed of numbers, vectors, and algorithms. The word algorithm comes from the name of a ninth-century Persian scientist, Al-Khwarizmi. But algorithms were commonplace long before him. Think of an algorithm as a set of instructions, or a recipe. Brenda Dietrich, the chief of math research at IBM, finds them even on the back of shampoo bottles. “Wash, rinse, repeat. That’s an algorithm,” she says. Algorithms power search engines and marketing campaigns. They schedule the entire major league baseball season. They mete out the hops and barley in a vat of Heineken and the corn syrup and caramel coloring in a tankard of Coke (that algorithm’s a carefully guarded secret).

  The algorithm didn’t rise to its current stardom until the invention of the computer—a machine that demands logical and well-ordered instructions (and is utterly useless without them). With its arrival, an entire branch of applied math and engineering started creating ever more algorithms. These were instructions for counting things, sorting them, making calculations and comparisons, and in short, accomplishing computing tasks. Naturally, many algorithms are packed with statistical analysis. A search engine algorithm, for example, counts how many pages link to each Web page, how often they’re each viewed, and how many times and how prominently the key words appear on each page. It builds a hierarchy on a load of calculations. But the instructions, the foundation of the algorithm, are not based on what we usually think of as mathematics. The keys are clarity and logic, within a rigid set of rules. Lawyers, I’m often told, are good at writing algorithms.

  One nugget from IBM Research: Many algorithms developed in past decades were viewed as theoretical. But with the dramatic advance in computing power, some of them now can be tried out on computers. They’re migrating from theory to practice. This is leading researchers to comb through their archives for hidden algorithmic gems.

  15 Agreed to sell Tacoda to AOL. Tacoda is not the only behavioral advertising company to get scooped up by an Internet giant. In September 2007, Yahoo paid $300 million for BlueLithium, a start-up very much like Tacoda. And the previous May, Microsoft spent $6 billion for aQuantive, an advertising technology company with a behavioral division, DrivePM.

  1. WORKER

  31 He attempted to prove that monogamy. Dantzig, “Discrete-Variable Extremum Problems,” Operations Research, vol. 5, no. 2, April 1957.

  35 Mountains of facts about each employee. “International Isn’t Just IBM’s First Name,” BusinessWeek, Jan. 28, 2008. This article reports that IBM has also developed a search engine called Small Blue to locate suitable employees. “The software scans employees’ blogs, e-mail, instant messages, and reports, then draws conclusions about each participant’s skills and expertise. When other employees search by topic on Small Blue, the program scans its findings to get a list of experts.”

  40 It’s getting late in Takriti’s office. Samer Takriti left his job at IBM in August 2007 to take a position on the math team at Goldman Sachs. We met for lunch later that fall near his office, not far from the ferry terminal at the southern tip of Manhattan. He said that he was ready for a change and had briefly considered other job offers, both at rival banks and at Google. He said he was excited to be working in finance and actively involved in business. It has a faster pace than research. Work on modeling the 50,000 consultants proceeds apace, say officials at IBM.

  2. SHOPPER

  42 They can study our patterns of consumption. One path to understanding humans, oddly enough, starts with so-called horse races, statistical tests to compare our behavior with that of others. They’re a standard of Internet marketing and a hand-me-down from the direct mail industry. In fact, every time we receive a pile of junk mail, we sort through a herd of test horses. Fair Isaac is a leader in helping companies analyze the results. Larry Rosenberger, Fair Isaac’s vice president for research, described the process to me one autumn afternoon at his offices in San Rafael, California.

  Rosenberger walked to the whiteboard to show me how credit card companies run these races. He drew a tall tube. “That’s a customer,” he said. He started drawing lines across it, as if creating little segments on a worm. “You might know his age, gender, income, you might have details on his behavior, what he bought and when. Each of these fields is something about the customer.” He drew another segmented worm, this one representing all the variations in interest rates, penalties, and frequent-flier miles that the credit card company offers. (He called it “the offering vector.”) Companies test each type of offer, some more generous, some less, with each demographic and then study the results. Eventually, the company can figure out the most profitable combination of incentives and rates, and even the wording and design of the pitch, for each group. Actually, I shouldn’t say “most” profitable because these companies always keep testing it against others. Horse races never stop. As we continue to use credit cards, they build up more detailed models of our buying behavior, and send us more horses. They produce more and more data, which can be matched against our growing record: what we buy, where we go, how deep we dive into debt. Some companies have taken this to extremes. Capital One, a leader in microtargeting, has developed more than 100,000 different profiles for credit card offers.

  Retail took a half-century detour. For a more detailed description of the post-World War II mass economy, see The New Marketing Paradigm, by Don E. Schultz, Stanley I. Tannenbaum, and Robert F. Lauterborn.

  44 Ghani made a splash in 2002. “Mining the Web to Add Semantic Details to Data Mining,” Springer Lecture Notes in Artificial Intelligence, vol. 3209, 2004.

  55 Think of buckets as genes. The language of genetics pervades the science of the Numerati. I did a search through my notes and found the word genome mentioned 139 times. The architectural term blueprint, which in the figurative sense is a synonym, popped up only 13 times. In one example among many, Martin Remy, the chief technologist at the San Francisco search start-up Sphere, says that his team develops “document genomes,” a combination of features “that lets us find other genetic matchings for documents.”

  58 One researcher at Microsoft. For more on Heckerman, read “Using Spam Blockers to Target HIV, Too,” BusinessWeek, Oct. 1, 2007.

  3. VOTER

  68 Joined with two coauthors to detail this triumph. See Applebee’s America, by Matthew J. Dowd, Ron Fournier, and Douglas B. Sosnick, Simon & Schuster, 2006.

  69 Each sliver of the electorate. Even political data mavens disagree about the value of microtargeting. My reporting took me to the Washington offices of Hal Malchow, a consultant who began data mining for voters back in the 1990s. This made him a mossback among the political Numerati. He said that despite all the excitement about consumer data, the most useful variables remained those that my father might have recognized as he worked to turn out voters for Richard Nixon in 1960. “These six things matter the most,” Malchow told me:

  1. Ethnicity. (Whites, blacks, Jews, and Catholics have different voting patterns.)

  2. Gender. (In recent presidential races, a majority of men have voted Republican.)

  3. Marital status. (Democrats do best among single women in a landslide.)

  4. Church attendance. (The pious are more conservative.)

  5. Gun ownership. (Conservatives, with a libertarian streak, tend to own guns.)

  6. Geography. (The higher the population density, the more liberal the voters.)

  Microtargeters do not challenge the significance of this list but insist that they can pry loose atypical individuals within these groups. Malchow, by contrast, argued that many efforts outside of these core areas amounted to marketing hype.

  Another note from Malchow: Although African Americans represent a core Democratic constituency, the party lacks reliable lists of black voters. “The myth is that we have African Americans,” he said. “We don’t.” Unlike Hispanics, they don’t have distinctive surnames. This leads list builders to search for first names they associate with African Americans, such as Latisha and Jamal. In the process, of course, they miss millions of Roberts, Janes, Toms, and Alices.

 
