The Numerati
In this new world, all of us are going to face situations in which our most intimate data is exposed, at least to somebody. And we may be interested, or at least willing, to share some of this data. HIV-infected patients, for example, might want to participate in a study and reveal lots of information about their symptoms and their spirits, maybe even their habits, but under one vital condition: that they remain nameless. The personal data can be shared but not the identity.
So we’re going to have to reevaluate our ideas about privacy and secrets. We all have different types of secrets. Some things we tell no one. Others we share with family and a friend or two. Many are secrets in name only; we blab about them all the time. But until recently, our secrets were scattered. The doctor guarded some of them, the banker others. The high school teacher, the dressmaker, the neighbors, the office mates, they all had their allotment. Some existed only in their memories, with certain details escaping, from time to time, into rising and falling streams of gossip. Many were scrawled on receipts or prescriptions, police forms or warnings from school. Most of them, if we played it right, didn’t mingle much. Unless a detective was on the case, the bits of information didn’t find each other. Now they can and will.
This can be scary. No doubt it will tempt a few of us to turn away from the data-spewing world altogether. Some will tiptoe around the Internet, if they venture there at all. They’ll pay with cash, avoiding the trail of credit cards. They might even wait in long lines to throw coins into the toll machines instead of coasting through the automatic readers (which can track many of our movements and even calculate our average speed).
But with a bit of knowledge, we can turn these tools to our advantage. You may not have noticed, but as we make our way in these pages from the snooping workplace to the laboratories of love, we gradually evolve from data serfs into data masters. In the beginning, employers are using the tools to analyze and optimize us as workers. In many of their calculations, we might as well be machines. Advertisers and political operatives gather up our data to plunk us into buckets. But they’re doing it to provide us with more ads and promotions targeted to our tastes and values, to give us more of what we want. That’s a step toward power. Once we’re in Intel’s home-health labs, hitching sensors to our bodies and wiring our kitchen floors with magic carpets, the balance shifts. We’re appealing to the science of the Numerati to protect us from falls and alert us before strokes and heart attacks. And by the time we’re prowling for love on Chemistry.com, we’ve come full circle. We’re paying for algorithmic profiles of ourselves and lining up mathematical correlations with potential mates. The point is, these statistical tools are going to be quietly assuming more and more power in our lives. We might as well learn how to grab the controls and use them for our own interests.
Where to start? In these early years, it’s hard. It involves reading the tiniest print on privacy disclosures at e-commerce sites and on the back of credit-card applications. But as we learn more about the value of our data, and our vulnerabilities, we’ll no doubt clamor for services to help us manage it. That should attract businesses to serve a growing market. One nonprofit organization founded in 2005, AttentionTrust, is leading the way. It provides Web surfers with the tools to amass their own data and to sell it, if they choose, to advertisers. In essence, AttentionTrust is urging people to harvest their own clicks and words—and to stop giving them away to companies like Tacoda, Umbria, and countless others. AttentionTrust hasn’t yet spread much beyond a circle of the Net savvy. And so far, the markets for selling our own data are embryonic. But that could well change as the broader public learns more about how the Numerati are adding us up.
AS I TYPE on a Sunday afternoon, I put on my noise-canceling headphones and jack up a Mahler symphony, all to block out a loud tutoring session upstairs. My 15-year-old is plowing through algebra. This leads me to wonder what he’ll need to learn for a life in which he’ll be measured in a thousand ways, analyzed bit by bit, and then reassembled and optimized by statistical wizards. Does he need to tackle advanced calculus? Should he delve into operations research, learn to manipulate eigenvectors and hidden Markov models? Do he and millions of others need to become Numerati themselves?
In a word, no. Let’s start by taking on three of the enduring myths that misinform this discussion and have done so for centuries, or even longer:
1. The world is divided between word people and number people.
This becomes true only because we let ourselves believe it. Mathematicians and computer scientists, in fact, speak words. Many of the ones I’ve met along my journey were speaking to me in their second or third language. Quite a few were eloquent. And those of us who cloister ourselves on the word side of the divide, who turn the page in a book every time a formula brimming with Greek letters and parentheses pops up (I’m guilty here), we too have minds full of numbers. We’re constantly adding and dividing and carrying out processes whose mathematical names sound alien to many of us. Consider this example. The baby woke up crying at 11 and again at 1, and then at 2:30. Does this mean—we lie in bed doing a quiet regression analysis—that the next cry will come at 3:30?
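The bedside guess above can be written out as arithmetic. The sketch below is only an illustration: it takes the three wake-up times from the example, measures them in hours after 11 p.m. (an assumed convention), and tries two simple forecasts. The function names `fit_line` and `to_clock` are made up for this sketch.

```python
# Wake-up times from the example, in hours after 11 p.m.: 11:00, 1:00, 2:30.
wake_times = [0.0, 2.0, 3.5]


def fit_line(ys):
    """Ordinary least-squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def to_clock(hours_after_11pm):
    """Turn hours after 11 p.m. into a 12-hour clock string such as '3:30'."""
    total_minutes = round(hours_after_11pm * 60)
    hour24 = (23 + total_minutes // 60) % 24
    return f"{hour24 % 12 or 12}:{total_minutes % 60:02d}"


# Forecast 1: a straight regression line, extended to the fourth wake-up.
slope, intercept = fit_line(wake_times)
line_prediction = slope * len(wake_times) + intercept

# Forecast 2: assume the gaps keep shrinking by the same amount each time.
last_gap = wake_times[-1] - wake_times[-2]    # 1.5 hours
prior_gap = wake_times[-2] - wake_times[-3]   # 2.0 hours
gap_prediction = wake_times[-1] + last_gap + (last_gap - prior_gap)

print("straight line: ", to_clock(line_prediction))
print("shrinking gaps:", to_clock(gap_prediction))
```

Under these assumptions, a straight line through the three wake-ups points to about 4:20, while carrying the shrinking gaps forward (two hours, then an hour and a half, then an hour) gives the 3:30 that the sleepless parent fears.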
The key difference between the Numerati and the rest of us lies in the toolbox they carry. It contains sets of mathematical formulas and drawers full of algorithms that mankind has been building for thousands of years. Using this know-how, they attempt to put complex reality into numbers so that theories can be tested and refined. They analyze whether new buildings will stand or bombs will explode, and they handle those traditional tasks on their own, with minimal input from those of us who aren’t handy with such tools (and cringe when confronted with them).
But the new challenges are different. The Numerati must now predict how we humans will respond to car advertisements or a wage hike. The models they build will fall flat if they fail to understand human behavior—if they plug in the wrong data. Figuring out how to boil us down to numbers requires not only the right tools but also the real-world context. That means they must work in teams that draw from different disciplines and include people with all kinds of expertise. There’s plenty of work for anthropologists, linguists, even historians. If there was ever a divide between so-called numbers people and word people, the challenges ahead demolish it.
2. The Numerati are in control. They’ll have their way with us.
Wrong. Even the greatest and most powerful of the Numerati only master certain domains. Everywhere else, they’ll be just like the rest of us: objects of study. Larry Page, for example, is a cofounder of Google and a titan in the world of the Numerati. His scientists are building machines to crunch hundreds of billions of our search queries and clicks, and to sell us, in neatly organized buckets, to advertisers. But when Josh Gotbaum’s political program pores through consumer data and classifies millions of California voters, it plunks Larry Page into a bucket of Still Waters or Right Clicks. Whether they’re patients with a genetic predisposition for blindness or supermarket shoppers with a sky-high tendency to throw a candy bar in the cart, the Numerati are sitting in the databases with the rest of us.
This is a wonderful thing because the people in the best position to exploit our privacy are also gaining an intimate understanding of how their own privacy can be trampled. They understand it better than anyone. This is the dynamic that turned Jeff Jonas, the Las Vegas data maven, into a privacy advocate.
3. Those who master the numbers will make all the money.
They’ll make money, no doubt. But not all of it, not by any stretch. Think back to the dawn of the automobile age. In workshops in Detroit and Stuttgart, engineers were turning out new machines poised to change the course of history. But plenty of people who didn’t know the difference between a piston and an alternator stood to make fortunes from cars. They just had to understand the trends and plan their businesses accordingly. Some of them built suburban subdivisions, malls, and fast-food restaurants where folks could eat in their cars. Some bought land where the highways would be coming through, and others sold oil tankers too huge to squeeze through the Panama Canal. Entertainment empires grew around Formula One racing and NASCAR. The motor economy was open to those who saw where things were headed.
That is as true today as it ever was. To make the case, follow me into one more business, Inform Technologies. Its founder, a former banker named Neal Goldman, is hard at work building his second fortune. He’s no algorithm wizard. But he has the imagination to see what the Numerati can build, and he has shown an uncanny ability to find good ones.
In the 1990s, when he was in his twenties, Goldman was an up-and-comer for Lehman Brothers in New York. He worked 120 hours a week, doing cross-border mergers and acquisitions, some of them worth billions of dollars. “It was incredibly intense,” he says. He would regularly pull all-nighters, preparing for early-morning presentations to management. So he’d be pecking away at the computer, getting numbers from Bloomberg, facts from analyst reports, details from annual reports. He was pulling together data, and it took time. “I’d spend a few hours organizing it, putting it into an Excel spreadsheet. Then at around 3 A.M.,” he recalls, “I’d start thinking.” How absurd it was, he realized, that a highly paid professional was spending most of the night hunting down data and plugging it into a spreadsheet. “Out of a 12-hour process, I was spending one hour thinking,” he says.
Goldman saw these all-night headaches at Lehman Brothers as a looming business opportunity. So he quit in 1998 and started a company. His plan was to build a tool to organize and structure all the diverse bits of information he spent those nights hunting. All of the connections had to be a click or two away. Someone studying an investment in a steel mill, for example, should be able to find not only the financial records and stock performance of steel companies—that was easy—but also the key players in the industry, their background, and articles written about them. He should also be able to track the people in the companies, where they’d worked before and gone to school, the connections they shared with board members. The service he envisioned would stitch together the entire web of the world economy, from raw materials to personal relations. For this, he would have to place an immense jumble of information into the same symbolic universe. Goldman was no mathematician, but he knew that if all these pieces of data were going to swim together in the same pool, they would need to be represented in a common language. He needed a geek.
Goldman advertised on a website, and one day a 16-year-old high school student named Joe Einhorn knocked on his door. “He was so shy, he couldn’t look you in the eye,” Goldman says. For a test, he gave the boy “some undoable tasks.” Einhorn reappeared a couple of days later. “He’d stayed up for 48 hours and coded the crap out of the thing.” Goldman had found a greenhorn ready to join the ranks of the Numerati. Joe Einhorn was his first employee. Later, Joe’s brother Jack would sign on. Since the age of 13, Jack had been doing cancer research in a New York University program at the Veterans Administration, looking for statistical patterns in the expression of a gene involved in the development of prostate cancer.
Eventually the team grew. New partners, investors, and tech specialists climbed aboard. The tool Goldman envisioned, Capital IQ, started to take shape. And it worked. Much of the financial universe was represented within it, in a complex matrix of vectors. All of the data circulated in the same orb, and it was organized by relationships. Want to find Yale grads sitting on corporate boards? Click. How about former Enron executives in the energy business? Click click. Goldman found customers for it, and in 2004, he and his partners sold the company to the Standard & Poor’s unit of McGraw-Hill for $225 million.
By the time I catch up with Goldman, he’s onto his next start-up. His new venture, Inform Technologies, is a Numerati-fueled precision weapon targeted at many of the people I work with: editors. At its root, Inform is much like Capital IQ. It ventures into the tangled, multilingual universe of written news, and it proposes to match readers with the webs of far-flung stories that will interest them. In its early stage, Inform sets out to organize the entire world of news so that every article is linked to every other piece of news that relates to it. A single profile of the Venezuelan strongman Hugo Chávez leads readers along lots of related trails, one about the oil industry, another about revolutions in Latin America, a third about Chávez’s friends and allies in Moscow and Tehran, and one about his rocky relations with Washington. In Inform’s scheme, each piece of news is a thread connecting to an immense and constantly mutating tapestry depicting the world today. It’s ambitious. But that’s just the beginning. In time, the idea is to follow the readers’ clicks and queries and turn them—or us—into statistical profiles. Each one of us will then get a customized stream of news. To create this service, the Inform team—led by the Einhorn brothers—must bring the world’s news onto a single mathematical platform. As Jack Einhorn describes it to me, Inform’s universe of news exists as a sphere of infinite dimensions, with the stories shooting through it as vectors. Each one intersects with the names and themes that they include. Related stories travel in the same clusters in this imaginary space. They intersect. This is similar to the vector-ridden galaxy we encountered in Carnegie Mellon’s “next friend” analysis. But this time, instead of scouring your social networks for a French-speaking lawyer, it may be tracking down just the article you want on changes in French law.
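The picture of stories as vectors that travel in clusters can be made concrete with a toy example. The sketch below is not Inform's system, whose workings are not described here; it swaps in a common textbook stand-in, word-count vectors compared by cosine similarity. The three story snippets and the names `to_vector`, `cosine`, and `most_related` are invented for illustration.

```python
import math
from collections import Counter


def to_vector(text):
    """Represent a story as word counts: one dimension per distinct word."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine of the angle between two word-count vectors (1.0 = same direction)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical stories, boiled down to a few key terms each.
stories = {
    "chavez_oil": "chavez oil venezuela exports",
    "chavez_moscow": "chavez moscow allies venezuela",
    "french_law": "french law reform paris",
}
vectors = {name: to_vector(text) for name, text in stories.items()}


def most_related(name):
    """Return the other story whose vector points most nearly the same way."""
    others = (other for other in vectors if other != name)
    return max(others, key=lambda other: cosine(vectors[name], vectors[other]))


print(most_related("chavez_oil"))
```

In this toy space, the two Chávez stories share names and themes and so point in similar directions, while the French law story shares no terms with either and sits apart. A real service would need far richer features than raw word counts, and would have to work across languages.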
When I think about where I fit in this algorithmic economy, I need look no further than Goldman’s start-up, which carries the Numerati straight into the heart of journalism. The editor he’s building has far more range than the human versions I’ve known. In the world in which I’ve built my career, most of the reporters covering school board meetings, tornados, floods, and wars are young. The idea is that when they grow older and more settled, they’ll be promoted to a higher-paying editing job. In theory, their long experience will help them pick and shape stories that serve and interest their readers. Call it judgment or gut feeling, it’s what sets these editors apart. But as I climb up to Inform’s sixth-floor offices, overlooking 57th Street on Manhattan’s East Side, I’m looking at an operation built to automate editing. If machines handle the editing, what should human editors do? Study math?
Let’s take a closer look at Inform. I walk into the office, and there are 30 workers in four low rows, all of them hunched over computers. None of them look at me. After a minute, Joe Einhorn greets me. He’s in his midtwenties and wears a baseball cap. He leads me into a conference room and asks if I want something to drink. It takes me a minute to realize that I’ve just met the chief science officer. By this point, his baseball cap is bobbing far down one of the aisles. I plug in my laptop and wait for Neal Goldman and Joe’s brother Jack.
Goldman, who’s in his late thirties, wears dark brown hair parted near the middle. The silver zipper on his black turtleneck reaches up to his Adam’s apple. Like Dave Morgan at Tacoda and Howard Kaushansky at Umbria, he harnesses the power of the Numerati without mastering their science. No doubt, he knows a whole lot more math than a liberal arts major like me. He has an MBA and worked in global finance. He understands statistical analysis. But he’s no whiz at set theory, algebraic geometry, or computer science. He harnesses his imagination, which is crucial, and he delegates. “I understand it conceptually,” he says. “I can take a problem, and I can start to parse out what types of data or scoring it would involve. Then I communicate it to people like Jack.”
Goldman’s skills are a lot more relevant, from a career-planning perspective, than the Einhorns’. Those brothers have special ability. Most of us don’t have it and never will. Societies must engage and nurture such talents. These are people, after all, who will be building the next Googles, perhaps vanquishing fearsome forms of cancer and exposing hidden networks of terrorists. For every society, locating these people at a young age, whether they’re living in barrios or working in rice paddies, is an imperative. But that’s a policy issue. For the great majority of us, becoming a math or computer science prodigy is not a career option.
So what other jobs are there? I look down those long rows of programmers at Inform and ask Jack Einhorn what kind of skills these people have. Some of them are wizards in their own right, he says. One, his childhood friend Ray, is building autonomous robots to go sniffing out news. Another, a Chinese Ph.D. named Kai, is experimenting with algorithms borrowed from facial recognition technology to pick out similarities in news articles from all over the world. Other workers, both in New York and India, are working on more mundane tasks. They’re building small applications, much like toolmakers in an auto plant.
Some of these people work—as they say in the software business—with their heads up. They talk to their colleagues on the floor, with designers, maybe even with users or the sales team. They collaborate. As they do, their value rises. Pull them out of the operation and a whole series of connections goes dead. Others are known in the profession as head-down. They’re on their own. In the world we’re entering, head-down workers who are not highly skilled are vulnerable. Since they are not knitted into the greater fabric of the project, they can be more easily replaced, like stand-alone machines, by cheaper head-down workers offshore. After all, numbers and computer codes zip offshore far faster than auto plants do. Math is no safe haven. Only those who like it, and are good at it, should pursue it.