“If you want to compare [AI] to the space program, we’re at Galileo,” Tenenbaum said. “We’re not yet at Newton.” He is convinced that while ongoing research into the brain is shining a light on intelligence, the larger goal—to reverse engineer human thought—will require immense effort and time. An enormous amount of science awaits before the engineering phase can begin. “The problem is exponentially harder [than manned space flight],” he said. “I wouldn’t be surprised if it took a couple hundred years.”
No wonder, you might say, that IBM opted for a more rapid approach. Yet even as Tenenbaum and others engage in quasi-theological debates about the future of intelligence, many are already snatching ideas from the brain to build what they can in the here-and-now. Tenenbaum’s own lab, using statistical formulas inspired by brain functions, is training computers to sort through scarce data and make predictions about everything from the location of oil deposits to suicide bombing attacks. For this, he hopes to infuse the machines with a thread or two of logic inspired by observations of the brain, helping them to connect dots the way people do. At the same time, legions of theorists, focused on the exponential advances in computer technology, are predicting that truly smart machines, also inspired by the brain, will be arriving far ahead of Tenenbaum’s timetable. They postulate that within decades, computers more intelligent than humans will dramatically alter the course of human evolution.
Meanwhile, other scientists in the field pursue a different type of question-answering system—a machine that actually knows things. For two generations, an entire community in AI has tried to teach computers about the world, describing the links between oxygen and hydrogen, Indiana and Ohio, tables and chairs. The goal is to build knowledge engines, machines very much like Watson but capable of much deeper reasoning. They have to know things and understand certain relationships to come up with insights. Could the emergence of a data-crunching wonder like Watson short-circuit their research? Or could their work help Watson grow from a dilettante into a scholar?
In the first years of the twenty-first century, Paul Allen, the cofounder of Microsoft, was pondering Aristotle. For several decades in the fourth century BC, that single Greek philosopher was believed to hold most of the world’s scientific knowledge in his head. Aristotle was like the Internet and Google combined. He stored the knowledge and located it. In a sense, he outperformed the Internet because he combined his factual knowledge with a mastery of language and context. He could answer questions fluently, and he was reputedly a wonderful teacher.
This isn’t to say that as an information system, Aristotle had no shortcomings. First, the universe of scientific knowledge in his day was tiny. (Otherwise it wouldn’t have fit into one head, no matter how brilliant.) What’s more, the bandwidth in and out of his prodigious mind was severely limited. Only a small group of philosophers and students (including the future Alexander the Great) enjoyed access to it, and then only during certain hours of the day, when the philosopher turned his attention to them. He did have to study, after all. Maintaining omniscience—or even a semblance of it—required hard work.
For perhaps the first time since the philosopher’s death, as Allen saw it, a single information system, the Internet, could host the universe of scientific knowledge, or at least a big chunk of it. But how could people gain access to this treasure, learn from it, winnow the truth from fiction and innuendo? How could computers teach us? The solution, it seemed to him, was to create a question-answering system for science, a digital Aristotle.
For years, Allen had been plowing millions into research on computing and the human brain. In 2003, he directed his technology incubator, Vulcan Inc., of Seattle, to sponsor long-range research to develop a digital Aristotle. The Vulcan team called it Project Halo. This scientific expert, they hoped, would fill a number of roles, from education to research. It would answer questions for students, maybe even develop a new type of interactive textbook. And it would serve as an extravagantly well-read research assistant in laboratories.
For Halo to succeed in these roles, it needed to do more than simply find things. It had to weave concepts together. This meant understanding, for example, that when water reaches 100 degrees centigrade it turns into steam and behaves very differently. Plenty of computers could impart that information. But how many could incorporate such knowledge into their analysis and reason from it? The idea of Halo was to build a system that, at least by a liberal definition of the word, could think.
The pilot project was to build a computer that could pass the college Advanced Placement tests in chemistry. Chemistry, said Noah Friedland, who ran the project for Vulcan, seemed like the ideal subject for a computer. It was a hard science “without a lot of psychological interpretations.” Facts were facts, or at least closer to them in chemistry than in squishier domains, like economics. And unlike biology, in which tissue scans and genomic research were unveiling new discoveries every month or two, chemistry was fairly settled. Halo would also sidestep the complications that came with natural language. Vulcan let the three competing companies, two American and one German, translate the questions from English into a logic language that their systems could understand. At some point in the future, they hoped, this digital Aristotle would banter back and forth in human languages. But in the four-month pilot, it just had to master the knowledge and logic of high school chemistry.
The three systems passed the test, albeit with middling scores. But if you looked at the process, you’d hardly know that machines were involved. Teaching chemistry to these systems required a massive use of human brainpower. Teams of humans—knowledge engineers—had to break down the fundamentals of chemistry into components that the computers could handle. Since the computer couldn’t develop concepts on its own, it had to learn them as exhaustive lists and laws. “We looked at the cost, and we said, ‘Gee, it costs $10,000 per textbook page to formulate this knowledge,’” Friedland said.
It seemed ludicrous. Instead of enlisting machines to help sort through the cascades of new scientific information, the machines were enlisting humans to encode the tiniest fraction of it—and at a frightful cost. The Vulcan team went on to explore ways in which thousands, or even millions, of humans could teach these machines more efficiently. In their vision, entire communities of experts would educate these digital Aristotles, much the way online communities were contributing their knowledge to create Wikipedia. Work has continued through the decade, but the two principles behind the Halo thinking haven’t changed: First, smart machines require smart teachers, and only humans are up to the job. Second, to provide valuable answers, these computers have to be fed factual knowledge, laws, formulas, and equations.
Not everyone agrees. Another faction, closely associated with search engines, is approaching machine intelligence from a different angle. They often remove human experts from the training process altogether and let computers, guided by algorithms, study largely on their own. These are the statisticians. They’re closer to Watson’s camp. For decades, they’ve been at odds with their rule-writing colleagues. But their approach registered a dramatic breakthrough in 2005, when the U.S. National Institute of Standards and Technology held one of its periodic competitions on machine translation. The government was ravenous for this translation technology. If machines could automatically monitor and translate Internet traffic, analysts might get a jump on trends in trade and technology and, even more important, terrorism. The competition that year focused on machine translation from Chinese and Arabic into English. And it drew the usual players, including a joint team from IBM and Carnegie Mellon and a handful of competitors from Europe. Many of these teams, with their blend of experts in linguistics, cognitive psychology, and computer science, had decades of experience working on translations.
One new player showed up: Google. The search giant had been hiring experts in machine translation, but its team differed from the others in one aspect: No one was expert in Arabic or Chinese. Forget the nuances of language. They would do it with math. Instead of translating based on semantic and grammatical structure, the interplay of the verbs and objects and prepositional phrases, their computers were focusing purely on statistical relationships. The Google team had fed millions of translated documents, many of them from the United Nations, into their computers and supplemented them with a multitude of natural-language text culled from the Web. This training set dwarfed their competitors’. Without knowing what the words meant, their computers had learned to associate certain strings of words in Arabic and Chinese with their English equivalents. Since they had so very many examples to learn from, these statistical models caught nuances that had long confounded machines. Using statistics, Google’s computers won hands down. “Just like that, they bypassed thirty years of work on machine translation,” said Ed Lazowska, the chairman of the computer science department at the University of Washington.
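The idea of pairing words by statistics alone, with no knowledge of meaning, can be illustrated with a toy sketch. This is not Google's system, whose internals the text does not describe. It is a minimal example that assumes a tiny, invented set of sentence pairs (Spanish and English, chosen only for readability) and scores each word pairing with the Dice coefficient, a standard co-occurrence measure swapped in here for illustration. The function name `learn_associations` and the sample data are hypothetical.

```python
from collections import Counter
from itertools import product


def learn_associations(sentence_pairs):
    """Map each source word to the target word it co-occurs with most strongly.

    Scores use the Dice coefficient: 2 * together / (count_src + count_tgt).
    The program never consults a dictionary or a grammar.
    """
    src_counts, tgt_counts, pair_counts = Counter(), Counter(), Counter()
    for src, tgt in sentence_pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_counts.update(src_words)   # sentences containing each source word
        tgt_counts.update(tgt_words)   # sentences containing each target word
        pair_counts.update(product(src_words, tgt_words))  # joint appearances

    best = {}
    for (s, t), together in pair_counts.items():
        score = 2 * together / (src_counts[s] + tgt_counts[t])
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return {s: t for s, (t, _) in best.items()}


# Hypothetical toy corpus of already-translated sentence pairs.
toy_pairs = [
    ("la casa", "the house"),
    ("la puerta", "the door"),
    ("casa blanca", "white house"),
    ("puerta blanca", "white door"),
]

table = learn_associations(toy_pairs)
print(table)
```

With only four sentence pairs the counts are enough to pair each word correctly. Real systems of the kind described work on strings of words and millions of documents, which is where the nuances come from.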
The statisticians trounced the experts. But the statistically trained machines they built, whether they were translating from Chinese or analyzing the ads that a Web surfer clicked, didn’t know anything. In that sense, they were like their question-answering cousins, the forerunners of the yet-to-be-conceived Jeopardy machine. They had no response to different types of questions, ones they weren’t programmed to answer. They were incapable of reasoning, much less coming up with ideas.
Machines were seemingly boxed in. When people taught them about the world, as in the Halo project, the process was too slow and expensive and the machines ended up “overfitted”—locked into single interpretations of facts and relationships. Yet when machines learned for themselves, they turned everything into statistics and remained, in their essence, ignorant.
How could computers get smarter about the world? Tom Mitchell, a computer science professor at Carnegie Mellon, had an idea. He would develop a system that, just like millions of other students, would learn by reading. As it read, it would map all the knowledge it could make sense of. It would learn that Buenos Aires appeared to be a city, and a capital too, and for that matter also a province, that it fit inside Argentina, which was a country, a South American country. The computer would perform the same analysis for billions of other entities. It would read twenty-four hours a day, seven days a week. It would be a perpetual reading machine, and by extracting information, it would slowly cobble together a network of knowledge: every president, continent, baseball team, volcano, endangered species, crime. Its curriculum was the World Wide Web.
Mitchell’s goal was not to build a smart computer but to construct a body of knowledge—a corpus—that smart computers everywhere could turn to as a reference. This computer, he hoped, would be doing on a global scale what the human experts in chemistry had done, at considerable cost, for the Halo system. Like Watson, Mitchell’s Read-the-Web computer, later called NELL, would feature a broad range of analytical tools, each one making sense of the readings from its own perspective. Some would compare word groups, others would parse the grammar. “Learning method A might decide, with 80 percent probability, that Pittsburgh is a city,” Mitchell said. “Method C believes that Luke Ravenstahl is the mayor of Pittsburgh.” As the system processed these two beliefs, it would find them consistent and mutually reinforcing. If the entity called Pittsburgh had a mayor, there was a good chance it was a city. Confidence in that belief would rise. The computer would learn.
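The way two consistent beliefs can raise confidence in each other can be sketched with a little arithmetic. The text does not say how NELL actually combines its methods, so the following is only an illustration under stated assumptions: it uses a noisy-OR rule, which treats the two sources of evidence as independent. The 80 percent figure comes from Mitchell's example. The confidence in the mayor belief and the strength of the link between having a mayor and being a city are invented values.

```python
def noisy_or(*probabilities):
    """Combine independent evidence for one claim.

    The claim is doubted only if every source is wrong, so the combined
    confidence is 1 minus the product of the individual doubts.
    """
    p_all_wrong = 1.0
    for p in probabilities:
        p_all_wrong *= 1.0 - p
    return 1.0 - p_all_wrong


# From the text: method A is 80 percent sure that Pittsburgh is a city.
p_city_from_method_a = 0.80

# Assumptions for illustration only (not from the text):
p_has_mayor = 0.70          # method C's confidence that Pittsburgh has a mayor
p_city_given_mayor = 0.90   # how strongly "has a mayor" suggests "is a city"

# Indirect support for "Pittsburgh is a city" via the mayor belief.
p_city_from_mayor = p_has_mayor * p_city_given_mayor

combined = noisy_or(p_city_from_method_a, p_city_from_mayor)
print(f"method A alone: {p_city_from_method_a:.3f}")
print(f"with mayor belief: {combined:.3f}")
```

Under these assumed numbers the combined confidence is higher than method A's alone, which is the reinforcement the passage describes. The result is still a probability, not a certainty.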
Mitchell’s team turned on NELL in January 2010. It worked on a subsection of the Web, a cross section of two hundred million Web pages that had been culled and curated by Mitchell’s colleague Jamie Callan. (Operating with a fixed training set made it easier in the early days to diagnose troubles and carry out experiments.) Within six months, the machine had developed some four hundred thousand beliefs—a minute fraction of what it would need for a global knowledge base. But Mitchell saw NELL and other fact-hunting systems growing quickly. “Within ten years,” he predicted, “we’ll have computer programs that can read and extract 80 percent of the content of the Web, which itself will be much bigger and richer.” This, he said, would produce “a huge knowledge base that AI can work from.”
Much like Watson, however, this knowledge base would brim with beliefs, not facts. After all, statistical systems merely develop confidence in facts as a calculation of probability. They believe, to one degree or another, but are certain of nothing. Humans, by contrast, must often work from knowledge. Halo’s Friedland (who left Vulcan to set up his own shop in 2005) argues that AI systems informed by machine learning will end up as dilettantes, like Watson (at least in its Jeopardy incarnation). A computer, he said, can ill afford to draw conclusions about a jet engine turbine based on “beliefs” about bypass ratios or the metallurgy of titanium alloys. It has to know those things.
So when it came to teaching knowledge machines at the end of the first decade of the twenty-first century, it was a question of picking your poison. Computers that relied on human teachers were slow to learn and frightfully expensive to teach. Those that learned automatically unearthed possible answers with breathtaking speed. But their knowledge was superficial, and they were unable to reason from it. The goal of AI—to marry the speed and range of machines with the depth and subtlety of the human brain—was still awaiting a breakthrough. Some believed it was at hand.
In 1859, the British writer Samuel Butler sailed from England, the most industrial country on earth, to the wilds of New Zealand. There, for a few years, he raised sheep. He was as far away as could be, on the antipodes, but he had the latest books shipped to him. One package included the new work by Charles Darwin, On the Origin of Species. Reading it led Butler to contemplate humanity in an evolutionary context. Presumably, humans had developed through millions of years, and their rhythms, from the perspective of his New Zealand farm, appeared almost timeless. Like sheep, people were born, grew up, worked, procreated, died, and didn’t change much. If the species evolved from one century to the next, it was imperceptible. But across the seas, in London, the face of the earth was changing. High-pressure steam engines, which didn’t exist when his parents were born, were powering trains across the countryside. Information was speeding across Europe through telegraph wires. And this was just the beginning. “In these last few ages,” he wrote, referring to machines, “an entirely new kingdom has sprung up, of which we as yet have only seen what will one day be considered the antediluvian prototypes of the race.” The next step of human evolution, he wrote in an 1863 letter to the editor of a local newspaper, would be led by the progeny of steam engines, electric turbines, and telegraphs. Human beings would eventually cede planetary leadership to machines. (Not to fear, he predicted: The machines would care for us, much the way humans tended to lesser beings.)
What sort of creature [is] man’s next successor in the supremacy of the earth likely to be? We have often heard this debated; but it appears to us that we are ourselves creating our own successors; we are daily adding to the beauty and delicacy of their physical organisation; we are daily giving them greater power and supplying by all sorts of ingenious contrivances that self-regulating, self-acting power which will be to them what intellect has been to the human race. In the course of ages we shall find ourselves the inferior race.
Butler’s vision, and others like it, nourished science fiction for more than a century. But in the waning years of the twentieth century, as the Internet grew to resemble a global intelligence and computers continued to gain in power, legions of technogeeks and philosophers started predicting that the age of machines was almost upon us. They called it the Singularity, a hypothetical time in which progress in technology would feed upon itself feverishly, leading to transformational change.
In August 2010, hundreds of computer scientists, cognitive psychologists, futurists, and curious technophiles descended on San Francisco’s Hyatt hotel, on the Embarcadero, for the two-day Singularity Summit. For most of these people, programming machines to catalogue knowledge and answer questions, whether manually or by machine, was a bit pedestrian. They weren’t looking for advances in technology that already existed. Instead, they were focused on a bolder challenge, the development of deep and broad machine intelligence known as Artificial General Intelligence. This, they believed, would lead to the next step of human evolution.
The heart of the Singularity argument, as explained by the technologists Vernor Vinge and Ray Kurzweil, the leading evangelists of the concept, lies in the power of exponential growth. As Samuel Butler noted, machines evolve far faster than humans. But information technology, which Butler only glimpsed, races ahead at an even faster rate. Digital tools double in power or capacity every year or two, whether they are storing data, transmitting it, or performing calculations. A single transistor cost $1 in 1968; by 2010 that same buck could buy a billion of them. This process, extended into the future, signaled that sometime in the third decade of this century, computers would rival or surpass the power and complexity of the human brain. At that point, conceivably, machines would organize our affairs, come up with groundbreaking ideas, and establish themselves as the cognitive leaders of the planet.
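The transistor figures can be checked against the claim that digital tools double every year or two. The short calculation below uses only the two data points quoted above and assumes smooth, steady doubling between them, which is a simplification.

```python
import math

# Figures quoted in the text.
transistors_per_dollar_1968 = 1
transistors_per_dollar_2010 = 1_000_000_000

years = 2010 - 1968
growth = transistors_per_dollar_2010 / transistors_per_dollar_1968

# How many times the amount per dollar had to double to grow a billionfold.
doublings = math.log2(growth)

# Average time per doubling, assuming a steady rate over the whole period.
years_per_doubling = years / doublings

print(f"{doublings:.1f} doublings in {years} years")
print(f"one doubling every {years_per_doubling:.2f} years")
```

A billionfold gain takes roughly thirty doublings, so over forty-two years the implied pace is one doubling about every 1.4 years, consistent with "every year or two."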
Many believed these machines were yet to be invented. They would come along in a decade or two, powered by new generations of spectacularly powerful semiconductors, perhaps fashioned from exotic nanomaterials, the heirs to silicon. And they would feature programs to organize knowledge and generate language and ideas. Maybe the hardware would replicate the structure of the human brain. Maybe the software would simulate its patterns. Who knew? Whatever its configuration, perhaps a few of Watson’s algorithms or an insight from Josh Tenenbaum’s research would find their way into this machinery.