“With AI, what we’re talking about is getting much quicker to the limit of what is physically possible,” said Nate. “And one thing that will obviously become possible in this space is the uploading of human minds.”
It was Nate’s belief that, should we manage to evade annihilation by the machines, such a state of digital grace would inevitably be ours. There was, in his view, nothing mystical or fantastical about this picture of things, because there was, as he put it, “nothing special about carbon.” Like everything else in nature—like trees, for instance, which he described as “nanotech machines that turn dirt and sunlight into more trees”—we were ourselves mechanisms.
Once we had enough computing power, he said, we would be able to simulate in full, down to the quantum level, everything that our brains were doing in their current form, the meat form.
Some version of this functionalist view of cognition was common among AI researchers, and was, in a sense, central to the entire project: the mind was a program distinguished not by its running on the sophisticated computing device of the brain, but by the operations it was capable of carrying out. (The project that people like Randal Koene and Todd Huffman were working toward now, for all its vast complexity and remoteness, could be achieved in the course of a long weekend by the kind of artificial superintelligence Nate was talking about.)
It was easy to forget, Nate continued, that as we sat here talking we were in fact using nanotech protein computers to conduct this exchange of ours, this transfer of data; and as he said this I wondered whether this conviction came so naturally to Nate, and to people like him, because their brains, their minds, functioned in the sort of logical, rigorously methodical way that made this metaphor seem intuitively accurate, made it seem, finally, not to be a metaphor at all. Whereas I myself had trouble thinking of my brain as a computer or any other kind of mechanism; if it were one, I’d be looking to replace it with a better model, because it was a profoundly inefficient device, prone to frequent crashes and dire miscalculations and lengthy meanderings on its way toward goals that it was, in the end, as likely as not to abandon anyway. Perhaps I was so resistant to this brain-as-computer idea because to accept it would be to necessarily adopt a model in which my own way of thinking was essentially a malfunction, a redundancy, a system failure.
There was something insidious about this tendency—of transhumanists, of Singularitarians, of techno-rationalists in general—to refer to human beings as though they were merely computers built from protein, to insist that the brain, as Minsky had put it, “happens to be a meat machine.” (Something I’d read earlier that day on Nate’s Twitter timeline, on which it was his custom to quote things overheard around the MIRI offices: “This is what happens when you run programs that fucked themselves into existence on computers made of meat.”) This rhetoric was viscerally unpleasant, I felt, because it reduced the complexity and strangeness of human experience to a simplistic instrumentalist model of stimulus and response, and thereby opened an imaginative space—an ideological space, really—wherein humans could very well be replaced, versioned out by more powerful machines, because the fate of all technologies was, in the end, to be succeeded by some device that was more sophisticated, more useful, more effective in its execution of its given tasks. The whole point of technology per se was to make individual technologies redundant as quickly as possible. And in this techno-Darwinist view of our future, as much as we would be engineering our own evolution, we would be creating our own obsolescence. (“We are ourselves creating our own successors,” wrote the English novelist Samuel Butler in 1863, in the wake of the Industrial Revolution, and four years after On the Origin of Species. “Man will become to the machine what the horse and dog are to man.”)
But there was something else, too, something seemingly trivial but somehow more deeply disturbing: the disgust that arose from the conjunction of two ostensibly irreconcilable systems of imagery, that of flesh and that of machinery. And the reason, perhaps, that this union excited in me such an elemental disgust was that, like all taboos, it brought forth something that was unspeakable precisely because of its proximity to the truth: the truth, in this case, that we were in fact meat, and that the meat that we were was no more or less than the material of the machines that we were, equally and oppositely. And in this sense there really was nothing special about carbon, in the same way that there was nothing special, nothing necessary, about the plastic and glass and silicon of the iPhone on which I was recording Nate Soares saying that there was nothing special about carbon.
And so the best-case scenario of the Singularitarians, the version of the future in which we merge with artificial superintelligence and become immortal machines, was, for me, no more appealing, in fact maybe even less appealing, than the worst-case scenario, in which artificial superintelligence destroyed us all. And it was this latter scenario, the failure mode as opposed to the God mode, that I had come to learn about, and that I was supposed to be getting terrified about—as I felt confident that I would, in due course, once I was able to move on from feeling terrified about the best-case scenario.
As he spoke, Nate methodically uncapped and recapped with his thumb the lid of the red marker he was using to illustrate on the whiteboard certain facts and theories he was explaining to me: about how, for instance, once we got to the point of developing a human-level AI, it would then logically follow that AI would very soon be in a position to program further iterations of itself, igniting an exponentially blazing inferno of intelligence that would consume all of creation.
“Once we can automate computer science research and AI research,” he said, “the feedback loop closes and you start having systems that can themselves build better systems.”
This was perhaps the foundational article of faith in the AI community, the idea that lay beneath both the ecstasies of the Singularity and the terror of catastrophic existential risk. It was known as the intelligence explosion, a notion that had first been introduced by the British statistician I. J. Good, a former Bletchley Park cryptographer who went on to advise Stanley Kubrick on his vision of AI in 2001: A Space Odyssey. In a paper called “Speculations Concerning the First Ultraintelligent Machine,” delivered at a NASA conference in 1965, Good outlined the prospect of a strange and unsettling transformation that was likely to come with the advent of the first human-level AI. “Let an ultraintelligent machine be defined,” he wrote, “as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind.”
The idea, then, is that this thing of our creation would be the ultimate tool, the teleological end point of a trajectory that began with the hurling of the first spear—“the last invention that man need ever make.” Good believed that such an invention would be necessary for the continued survival of the species, but that disaster could only be avoided if the machine were “docile enough to tell us how to keep it under control.”
This mechanism, docile or otherwise, would be operating at an intellectual level so far above that of its human progenitors that its machinations, its mysterious ways, would be impossible for us to comprehend, in much the same way that our actions are, presumably, incomprehensible to the minds of the rats and monkeys we use in scientific experiments. And so this intelligence explosion would, in one way or another, be the end of the era of human dominance—and very possibly the end of human existence.
“It is unreasonable,” as Minsky had put it, “to think that machines could become nearly as intelligent as we are and then stop, or to suppose that we will always be able to compete with them in wit or wisdom. Whether or not we could retain some sort of control of the machines, assuming that we would want to, the nature of activities or aspirations would be changed utterly by the presence on earth of intellectually superior beings.”
This is, in effect, the basic idea of the Singularity, and of its dark underside of catastrophic existential risk. The word “singularity” is primarily a term from physics, where it refers to the point at the exact center of a black hole, where the density of matter becomes infinite and the laws of space-time break down.
“It gets very hard to predict the future once you have smarter-than-human things around,” said Nate. “In the same way that it gets very hard for a chimp to predict what is going to happen because there are smarter-than-chimp things around. That’s what the Singularity is: it’s the point past which you expect you can’t see.”
Despite this conviction that the post-AI future would be even more difficult to predict than the garden-variety future—which was itself notoriously difficult to say anything useful or accurate about—it was Nate’s reading of the situation that, whatever wound up happening, it was highly unlikely not to involve humanity being clicked and dragged to the trash can of history.
What he and his colleagues—at MIRI, at the Future of Humanity Institute, at the Future of Life Institute—were working to prevent was the creation of an artificial superintelligence that viewed us, its creators, as raw material that could be reconfigured into some more useful form (not necessarily paper clips). And the way Nate spoke about it, it was clear that he believed the odds to be stacked formidably high against success.
“To be clear,” said Nate, “I do think that this is the shit that’s gonna kill me.”
I was, for some reason, startled to hear it put so bluntly. It made sense, obviously, that Nate would take this threat as seriously as he did. I knew well enough that this was not an intellectual game for people like him, that they truly believed this to be a very real possibility for the future. And yet the idea that he imagined himself more likely to be killed by an ingenious computer program than to die of cancer or heart disease or old age seemed, in the final analysis, basically insane. He had presumably arrived at this position by the most rational of routes—though I understood almost nothing of the mathematical symbols and logic trees he’d scrawled on the whiteboard for my benefit, I took them as evidence of such—and yet it seemed to me about as irrational a position as it was possible to occupy. Not for the first time, I was struck by the way in which absolute reason could serve as the faithful handmaiden of absolute lunacy. But then again, perhaps I was the one who was mad, or at least too dim-witted, too hopelessly uninformed to see the logic of this looming apocalypse.
“Do you really believe that?” I asked him. “Do you really believe that an AI is going to kill you?”
Nate nodded curtly, and clicked the cap back on his red marker.
“All of us,” he said. “That’s why I left Google. It’s the most important thing in the world, by some distance. And unlike other catastrophic risks—like, say, climate change—it’s dramatically underserved. There are thousands of person-years and billions of dollars being poured into the project of developing AI. And there are fewer than ten people in the world right now working full-time on safety. Four of whom are in this building. It’s like there are thousands of people racing to be the first to develop nuclear fusion, and basically no one is working on containment. And we have got to get containment working. Because there are a lot of very clever people building something that, the way they are approaching it at present, will kill us all if they succeed.”
“So as things stand,” I said, “we’re more likely to be wiped out by this technology than not, is what you’re saying?”
“In the default scenario, yes,” said Nate, and placed the red marker on his desk, balancing it upright like a prelaunch missile. His manner was, I felt, remarkably detached for someone who was talking about my death, or the death of my son, or of my future grandchildren—not to mention every other human being unlucky enough to be around for this imminent apocalypse. He seemed as though he were talking about some merely technical problem, some demanding, exacting bureaucratic challenge—as, in some sense, he was.
“I’m somewhat optimistic,” he said, leaning back in his chair, “that if we raise more awareness about the problems, then with a couple more rapid steps in the direction of artificial intelligence, people will become much more worried that this stuff is close, and AI as a field will wake up to this. But without people like us pushing this agenda, the default path is surely doom.”
For reasons I find difficult to identify, this term default path stayed with me all that morning, echoing quietly in my head as I left MIRI’s offices and made for the BART station, and then as I hurtled westward through the darkness beneath the bay. I had not encountered the phrase before, but understood intuitively that it was a programming term of art transposed onto the larger text of the future. And this term default path—which, I later learned, referred to the list of directories in which an operating system seeks executable files according to a given command—seemed in this way to represent in miniature an entire view of reality: an assurance, reinforced by abstractions and repeated proofs, that the world operated as an arcane system of commands and actions, and that its destruction or salvation would be a consequence of rigorously pursued logic. It was exactly the sort of apocalypse, in other words, and exactly the sort of redemption, that a computer programmer would imagine.
—
And what was the nature of this rigorously pursued logic? What was it that was needed to prevent this apocalypse?
What was needed, first of all, was what was always needed: money, and clever people. And luckily, there were a number of people with sufficient money to fund a number of people who were sufficiently clever. A lot of MIRI’s funding came in the form of smallish donations from concerned citizens—people working in tech, largely: programmers and software engineers and so on—but they also received generous endowments from billionaires like Peter Thiel and Elon Musk.
The week that I visited MIRI happened to coincide with a huge conference, held at Google’s headquarters in Mountain View, organized by a group called Effective Altruism—a growing social movement, increasingly influential among Silicon Valley entrepreneurs and within the rationalist community, which characterized itself as “an intellectual movement that uses reason and evidence to improve the world as much as possible.” (An effectively altruistic act, as opposed to an emotionally altruistic one, might involve a college student deciding that, rather than becoming a doctor and spending her career curing blindness in the developing world, her time would be better spent becoming a Wall Street hedge fund manager and donating enough of her income to charity to pay for several doctors to cure a great many more people of blindness.) The conference had substantially focused on questions of AI and existential risk. Thiel and Musk, who’d spoken on a panel at the conference along with Nick Bostrom, had been influenced by the moral metrics of Effective Altruism to donate large amounts of money to organizations focused on AI safety.
Effective Altruism had significant crossover, in terms of constituency, with the AI existential risk movement. (In fact, the Centre for Effective Altruism, the main international promoter of the movement, happened to occupy an office in Oxford just down the hall from the Future of Humanity Institute.)
It seemed to me odd, though not especially surprising, that a hypothetical danger arising from a still nonexistent technology would, for these billionaire entrepreneurs, be more worthy of investment than, say, clean water in the developing world or the problem of grotesque income inequality in their own country. It was, I learned, a question of return on investment—of time, and money, and effort. The person I learned this from was Viktoriya Krakovna, the Harvard mathematics PhD student who had cofounded—along with the MIT cosmologist Max Tegmark and Skype founder Jaan Tallinn—the Future of Life Institute, which earlier that year had received an endowment of $10 million from Musk in order to establish a global research initiative aimed at averting AI catastrophe.
“It is about how much bang you get for your buck,” she said, the American idiom rendered strange by her Ukrainian accent, with its percussive plosives, its throttled vowels. She and I and her husband, Janos, a Hungarian-Canadian mathematician and former research fellow at MIRI, were the only diners in an Indian restaurant on Berkeley’s Shattuck Avenue, the kind of cavernously un-fancy setup that presumably tended to cater to drunken undergraduates. Viktoriya spoke between forkfuls of an extremely spicy chicken dish, which she consumed with impressive speed and efficiency. Her manner was confident but slightly remote, and, as with Nate, characterized by a minimal quantity of eye contact.
She and Janos were in the Bay Area for the Effective Altruism conference; they lived in Boston, in a kind of rationalist commune called Citadel; they had met ten years ago at a high school math camp, and had been together since.
“The concerns of existential risk fit into that value metric,” elaborated Viktoriya. “If you consider balancing the interests of future people against those who already exist, reducing the probability of a major future catastrophe can be a very high-impact decision. If you succeed in avoiding an event that might wipe out all of future humanity, that clearly exceeds any good you might do for people currently living.”
The Future of Life Institute was less focused than MIRI on the mathematical arcana of how a “friendly AI” might be engineered. The group, she said, functioned as “the outreach arm of this cluster of organizations,” raising awareness about the seriousness of this problem. It was not the attention of the media or the general public for which FLI was campaigning, Viktoriya said, but rather that of AI researchers themselves, a constituency in which the idea of existential risk was only just beginning to be taken seriously.
One of the people who had been most instrumental in its being taken seriously was Stuart Russell, a professor of computer science at U.C. Berkeley who had, more or less literally, written the book on artificial intelligence. (He was the coauthor, with Google’s research director Peter Norvig, of Artificial Intelligence: A Modern Approach, the book most widely used as a core AI text in university computer science courses.)