Rationality: From AI to Zombies


by Eliezer Yudkowsky

So, in the beginning, I made the same mistake: I didn’t understand intelligence, so I imagined throwing a Manhattan Project at the problem.

But, having calculated the planetary death rate at 55 million per year or 150,000 per day, I did not turn around and run away from the big scary problem like a frightened rabbit. Instead, I started trying to figure out what kind of AI project could get there fastest. If I could make the intelligence explosion happen one hour earlier, that was a reasonable return on investment for a pre-explosion career. (I wasn’t thinking in terms of existential risks or Friendly AI at this point.)

So I didn’t run away from the big scary problem like a frightened rabbit, but stayed to see if there was anything I could do.

Fun historical fact: In 1998, I’d written this long treatise proposing how to go about creating a self-improving or “seed” AI (a term I had the honor of coining). Brian Atkins, who would later become the founding funder of the Machine Intelligence Research Institute, had just sold Hypermart to Go2Net. Brian emailed me to ask whether this AI project I was describing was something that a reasonable-sized team could go out and actually do. “No,” I said, “it would take a Manhattan Project and thirty years,” so for a while we were considering a new dot-com startup instead, to create the funding to get real work done on AI . . .

A year or two later, after I’d heard about this newfangled “open source” thing, it seemed to me that there was some preliminary development work—new computer languages and so on—that a small organization could do; and that was how MIRI started.

This strategy was, of course, entirely wrong.

But even so, I went from “There’s nothing I can do about it now” to “Hm . . . maybe there’s an incremental path through open-source development, if the initial versions are useful to enough people.”

This is back at the dawn of time, so I’m not saying any of this was a good idea. But in terms of what I thought I was trying to do, a year of creative thinking had shortened the apparent pathway: The problem looked slightly less impossible than it had the very first time I’d approached it.

The more interesting pattern is my entry into Friendly AI. Initially, Friendly AI hadn’t been something that I had considered at all—because it was obviously impossible and useless to deceive a superintelligence about what was the right course of action.

So, historically, I went from completely ignoring a problem that was “impossible,” to taking on a problem that was merely extremely difficult.

Naturally this increased my total workload.

Same thing with trying to understand intelligence on a precise level. Originally, I’d written off this problem as impossible, thus removing it from my workload. (This logic seems pretty deranged in retrospect—Nature doesn’t care what you can’t do when It’s writing your project requirements—but I still see AI folk trying it all the time.) To hold myself to a precise standard meant putting in more work than I’d previously imagined I needed. But it also meant tackling a problem that I would have dismissed as entirely impossible not too much earlier.

Even though individual problems in AI have seemed to become less intimidating over time, the total mountain-to-be-climbed has increased in height—just like conventional wisdom says is supposed to happen—as problems got taken off the “impossible” list and put on the “to do” list.

I started to understand what was happening—and what “Persevere!” really meant—at the point where I noticed other AI folk doing the same thing: saying “Impossible!” on problems that seemed eminently solvable—relatively more straightforward, as such things go. But they were things that would have seemed vastly more intimidating at the point when I first approached the problem.

And I realized that the word “impossible” had two usages:

Mathematical proof of impossibility conditional on specified axioms;

“I can’t see any way to do that.”

Needless to say, all my own uses of the word “impossible” had been of the second type.

Any time you don’t understand a domain, many problems in that domain will seem impossible because when you query your brain for a solution pathway, it will return null. But there are only mysterious questions, never mysterious answers. If you spend a year or two working on the domain, then, if you don’t get stuck in any blind alleys, and if you have the native ability level required to make progress, you will understand it better. The apparent difficulty of problems may go way down. It won’t be as scary as it was to your novice-self.

And this is especially likely on the confusing problems that seem most intimidating.

Since we have some notion of the processes by which a star burns, we know that it’s not easy to build a star from scratch. Because we understand gears, we can prove that no collection of gears obeying known physics can form a perpetual motion machine. These are not good problems on which to practice doing the impossible.

When you’re confused about a domain, problems in it will feel very intimidating and mysterious, and a query to your brain will produce a count of zero solutions. But you don’t know how much work will be left when the confusion clears. Dissolving the confusion may itself be a very difficult challenge, of course. But the word “impossible” should hardly be used in that connection. Confusion exists in the map, not in the territory.

So if you spend a few years working on an impossible problem, and you manage to avoid or climb out of blind alleys, and your native ability is high enough to make progress, then, by golly, after a few years it may not seem so impossible after all.

But if something seems impossible, you won’t try.

Now that’s a vicious cycle.

If I hadn’t been in a sufficiently driven frame of mind that “forty years and a Manhattan Project” just meant we should get started earlier, I wouldn’t have tried. I wouldn’t have stuck to the problem. And I wouldn’t have gotten a chance to become less intimidated.

I’m not ordinarily a fan of the theory that opposing biases can cancel each other out, but sometimes it happens by luck. If I’d seen that whole mountain at the start—if I’d realized at the start that the problem was not to build a seed capable of improving itself, but to produce a provably correct Friendly AI—then I probably would have burst into flames.

Even so, part of understanding those above-average scientists who constitute the bulk of AGI researchers is realizing that they are not driven to take on a nearly impossible problem even if it takes them 40 years. By and large, they are there because they have found the Key to AI that will let them solve the problem without such tremendous difficulty, in just five years.

Richard Hamming used to go around asking his fellow scientists two questions: “What are the important problems in your field?,” and, “Why aren’t you working on them?”

Often the important problems look Big, Scary, and Intimidating. They don’t promise 10 publications a year. They don’t promise any progress at all. You might not get any reward after working on them for a year, or five years, or ten years.

And not uncommonly, the most important problems in your field are impossible. That’s why you don’t see more philosophers working on reductionist decompositions of consciousness.

Trying to do the impossible is definitely not for everyone. Exceptional talent is only the ante to sit down at the table. The chips are the years of your life. If wagering those chips and losing seems like an unbearable possibility to you, then go do something else. Seriously. Because you can lose.

I’m not going to say anything like, “Everyone should do something impossible at least once in their lifetimes, because it teaches an important lesson.” Most of the people all of the time, and all of the people most of the time, should stick to the possible.

Never give up? Don’t be ridiculous. Doing the impossible should be reserved for very special occasions. Learning when to lose hope is an important skill in life.

But if there’s something you can imagine that’s even worse than wasting your life, if there’s something you want that’s more important than thirty chips, or if there are scarier things than a life of inconvenience, then you may have cause to attempt the impossible.

There’s a good deal to be said for persevering through difficulties; but one of the things that must be said of it is that it does keep things difficult. If you can’t handle that, stay away! There are easier ways to obtain glamor and respect. I don’t want anyone to read this and needlessly plunge headlong into a life of permanent difficulty.

But to conclude: The “perseverance” that is required to work on important problems has a component beyond working 14 hours a day.

It’s strange, the pattern of what we notice and don’t notice about ourselves. This selectivity isn’t always about inflating your self-image. Sometimes it’s just about ordinary salience.

To keep working was a constant struggle for me, so it was salient: I noticed that I couldn’t work for 14 solid hours a day. It didn’t occur to me that “perseverance” might also apply at a timescale of seconds or years. Not until I saw people who instantly declared “impossible” anything they didn’t want to try, or saw how reluctant they were to take on work that looked like it might take a couple of decades instead of “five years.”

That was when I realized that “perseverance” applied at multiple time scales. On the timescale of seconds, perseverance is to “not give up instantly at the very first sign of difficulty.” On the timescale of years, perseverance is to “keep working on an insanely difficult problem even though it’s inconvenient and you could be getting higher personal rewards elsewhere.”

To do things that are very difficult or “impossible,”

First you have to not run away. That takes seconds.

Then you have to work. That takes hours.

Then you have to stick at it. That takes years.

Of these, I had to learn to do the first reliably instead of sporadically; the second is still a constant struggle for me; and the third comes naturally.

*

309

Make an Extraordinary Effort

It is essential for a man to strive with all his heart, and to understand that it is difficult even to reach the average if he does not have the intention of surpassing others in whatever he does.

—Budo Shoshinshu1

In important matters, a “strong” effort usually results in only mediocre results. Whenever we are attempting anything truly worthwhile our effort must be as if our life is at stake, just as if we were under a physical attack! It is this extraordinary effort—an effort that drives us beyond what we thought we were capable of—that ensures victory in battle and success in life’s endeavors.

—Flashing Steel: Mastering Eishin-Ryu Swordsmanship2

“A ‘strong’ effort usually results in only mediocre results”—I have seen this over and over again. The slightest effort suffices to convince ourselves that we have done our best.

There is a level beyond the virtue of tsuyoku naritai (“I want to become stronger”). Isshoukenmei was originally the loyalty that a samurai offered in return for his position, containing characters for “life” and “land.” The term evolved to mean “make a desperate effort”: Try your hardest, your utmost, as if your life were at stake. It was part of the gestalt of bushido, which was not reserved only for fighting. I’ve run across variant forms issho kenmei and isshou kenmei; one source indicates that the former indicates an all-out effort on some single point, whereas the latter indicates a lifelong effort.

I try not to praise the East too much, because there’s a tremendous selectivity in which parts of Eastern culture the West gets to hear about. But on some points, at least, Japan’s culture scores higher than America’s. Having a handy compact phrase for “make a desperate all-out effort as if your own life were at stake” is one of those points. It’s the sort of thing a Japanese parent might say to a student before exams—but don’t think it’s cheap hypocrisy, like it would be if an American parent made the same statement. They take exams very seriously in Japan.

Every now and then, someone asks why the people who call themselves “rationalists” don’t always seem to do all that much better in life, and from my own history the answer seems straightforward: It takes a tremendous amount of rationality before you stop making stupid damn mistakes.

As I’ve mentioned a couple of times before: Robert Aumann, the Nobel laureate who first proved that Bayesians with the same priors cannot agree to disagree, is a believing Orthodox Jew. Surely he understands the math of probability theory, but that is not enough to save him. What more does it take? Studying heuristics and biases? Social psychology? Evolutionary psychology? Yes, but also it takes isshoukenmei, a desperate effort to be rational—to rise above the level of Robert Aumann.

Sometimes I do wonder if I ought to be peddling rationality in Japan instead of the United States—but Japan is not preeminent over the United States scientifically, despite their more studious students. The Japanese don’t rule the world today, though in the 1980s it was widely suspected that they would (hence the Japanese asset bubble). Why not?

In the West, there is a saying: “The squeaky wheel gets the grease.”

In Japan, the corresponding saying runs: “The nail that sticks up gets hammered down.”

This is hardly an original observation on my part: but entrepreneurship, risk-taking, leaving the herd, are still advantages the West has over the East. And since Japanese scientists are not yet preeminent over American ones, this would seem to count for at least as much as desperate efforts.

Anyone who can muster their willpower for thirty seconds can make a desperate effort to lift more weight than they usually could. But what if the weight that needs lifting is a truck? Then desperate efforts won’t suffice; you’ll have to do something out of the ordinary to succeed. You may have to do something that you weren’t taught to do in school. Something that others aren’t expecting you to do, and might not understand. You may have to go outside your comfortable routine, take on difficulties you don’t have an existing mental program for handling, and bypass the System.

This is not included in isshoukenmei, or Japan would be a very different place.

So then let us distinguish between the virtues “make a desperate effort” and “make an extraordinary effort.”

And I will even say: The second virtue is higher than the first.

The second virtue is also more dangerous. If you put forth a desperate effort to lift a heavy weight, using all your strength without restraint, you may tear a muscle. Injure yourself, even permanently. But if a creative idea goes wrong, you could blow up the truck and any number of innocent bystanders. Think of the difference between a businessperson making a desperate effort to generate profits, because otherwise they must go bankrupt; versus a businessperson who goes to extraordinary lengths to profit, in order to conceal an embezzlement that could send them to prison. Going outside the system isn’t always a good thing.

A friend of my little brother’s once came over to my parents’ house, and wanted to play a game—I entirely forget which one, except that it had complex but well-designed rules. The friend wanted to change the rules, not for any particular reason, but on the general principle that playing by the ordinary rules of anything was too boring. I said to him: “Don’t violate rules for the sake of violating them. If you break the rules only when you have an overwhelmingly good reason to do so, you will have more than enough trouble to last you the rest of your life.”

Even so, I think that we could do with more appreciation of the virtue “make an extraordinary effort.” I’ve lost count of how many people have said to me something like: “It’s futile to work on Friendly AI, because the first AIs will be built by powerful corporations and they will only care about maximizing profits.” “It’s futile to work on Friendly AI, the first AIs will be built by the military as weapons.” And I’m standing there thinking: Does it even occur to them that this might be a time to try for something other than the default outcome? They and I have different basic assumptions about how this whole AI thing works, to be sure; but if I believed what they believed, I wouldn’t be shrugging and going on my way.

Or the ones who say to me: “You should go to college and get a Master’s degree and get a doctorate and publish a lot of papers on ordinary things—scientists and investors won’t listen to you otherwise.” Even assuming that I tested out of the bachelor’s degree, we’re talking about at least a ten-year detour in order to do everything the ordinary, normal, default way. And I stand there thinking: Are they really under the impression that humanity can survive if every single person does everything the ordinary, normal, default way?

I am not fool enough to make plans that depend on a majority of the people, or even 10% of the people, being willing to think or act outside their comfort zone. That’s why I tend to think in terms of the privately funded “brain in a box in a basement” model. Getting that private funding does require a tiny fraction of humanity’s six billions to spend more than five seconds thinking about a non-prepackaged question. As challenges posed by Nature go, this seems to have a kind of awful justice to it—that the life or death of the human species depends on whether we can put forth a few people who can do things that are at least a little extraordinary. The penalty for failure is disproportionate, but that’s still better than most challenges of Nature, which have no justice at all. Really, among the six billion of us, there ought to be at least a few who can think outside their comfort zone at least some of the time.
