Nothing in the model of children that I was testing allowed for that level of complexity at all. This was hugely more complex than any of the models in psychology. It was an information processing system that was smart and could figure out what was going on, and for me, that was the end of psychology. The models they had were hopelessly inadequate compared with the complexity of what they were dealing with.
MARTIN FORD: After leaving the field of psychology, how did you end up going into artificial intelligence?
GEOFFREY HINTON: Well, before I moved into the world of AI, I became a carpenter, and whilst I enjoyed it, I wasn’t an expert at it. During that time, I met a really good carpenter, and it was highly depressing, so because of that I went back to academia.
MARTIN FORD: Well, given the other path that opened up for you, it’s probably a good thing that you weren’t a great carpenter!
GEOFFREY HINTON: Following my attempt at carpentry, I worked as a research assistant on a psychology project trying to understand how language develops in very young children, and how it is influenced by social class. I was responsible for creating a questionnaire that would assess the attitude of the mother toward their child’s language development. I cycled out to a very poor suburb of Bristol, and I knocked on the door of the first mother I was due to talk to. She invited me in and gave me a cup of tea, and then I asked her my first question, which was: “What’s your attitude towards your child’s use of language?” She replied, “If he uses language, we hit him.” So that was pretty much it for my career as a social psychologist.
After that I went into AI and became a graduate student in artificial intelligence at The University of Edinburgh. My adviser was a very distinguished scientist called Christopher Longuet-Higgins who’d initially been a professor of chemistry at Cambridge and had then switched fields to artificial intelligence. He was very interested in how the brain might work—and in particular, studying things like holograms. He had realized that computer modeling was the way to understand the brain, and he was working on that, and that’s why I originally signed up with him. Unfortunately for me, about the same time that I signed up with him, he changed his mind. He decided that these neural models were not the way to understand intelligence, and the actual way to understand intelligence was to try and understand language.
It’s worth remembering that at the time, there were some impressive models—using symbol processing—of systems that could talk about arrangements of blocks. An American professor of computer science called Terry Winograd wrote a very nice thesis that showed how you could get a computer to understand some language and to answer questions, and it would actually follow commands. You could say to it, “put the block that’s in the blue box on top of the red cube,” and it would understand and do that. It was only in a simulation, but it would understand the sentence. That impressed Christopher Longuet-Higgins a lot, and he wanted me to work on that, but I wanted to keep working on neural networks.
Now, Christopher was a very honorable guy, but we completely disagreed on what I should do. I kept refusing to do what he said, but he kept me on anyway. I continued my work on neural networks, and eventually, I did a thesis on neural networks, though at the time, neural networks didn’t work very well and there was a consensus that they were just nonsense.
MARTIN FORD: When was this in relation to Marvin Minsky and Seymour Papert’s Perceptrons book?
GEOFFREY HINTON: This was in the early ‘70s, and Minsky and Papert’s book came out in the late ‘60s. Almost everybody in artificial intelligence thought that was the end of neural networks. They thought that trying to understand intelligence by studying neural networks was like trying to understand intelligence by studying transistors; it just wasn’t the way to do it. They thought intelligence was all about programs, and you had to understand what programs the brain was using.
These two paradigms were completely different, they aimed to try and solve different problems, and they used completely different methods and different kinds of mathematics. Back then, it wasn’t at all clear which was going to be the winning paradigm. It’s still not clear to some people today.
What was interesting was that some of the people most associated with logic actually believed in the neural net paradigm. The biggest examples are John von Neumann and Alan Turing, who both thought that big networks of simulated neurons were a good way to study intelligence and figure out how those things work. However, the dominant approach in AI was symbol processing inspired by logic. In logic, you take symbol strings and alter them to arrive at new symbol strings, and people thought that must be how reasoning works.
They thought neural nets were far too low-level, and that they were all about implementation, just like how transistors are the implementation layer in a computer. They didn’t think you could understand intelligence by looking at how the brain is implemented, they thought you could only understand it by looking at intelligence in itself, and that’s what the conventional AI approach was.
I think it was disastrously wrong, something that we’re now seeing. The success of deep learning is showing that the neural net paradigm is actually far more successful than the logic-based paradigm, but back then in the 1970s, that was not what people thought.
MARTIN FORD: I’ve seen a lot of articles in the press suggesting deep learning is being overhyped, and this hype could lead to disappointment and then less investment, and so forth. I’ve even seen the phrase “AI Winter” being used. Is that a real fear? Is this potentially a dead end, or do you think that neural networks are the future of AI?
GEOFFREY HINTON: In the past when AI has been overhyped—including backpropagation in the 1980s—people were expecting it to do great things, and it didn’t actually do things as great as they hoped. Today, it’s already done great things, so it can’t possibly all be just hype. It’s how your cell phone recognizes speech, it’s how a computer can recognize things in photos, and it’s how Google does machine translation. Hype means you’re making big promises, and you’re not going to live up to them, but if you’ve already achieved them, that’s clearly not hype.
I occasionally see an advertisement on the web that says it’s going to be a 19.9 trillion-dollar industry. That seems like rather a big number, and that might be hype, but the idea that it’s a multi-billion-dollar industry clearly isn’t hype, because multiple people have put billions of dollars into it and it’s worked for them.
MARTIN FORD: Do you believe the best strategy going forward is to continue to invest exclusively in neural networks? Some people still believe in symbolic AI, and they think there’s potentially a need for a hybrid approach that incorporates both deep learning and more traditional approaches. Would you be open to that, or do you think the field should focus only on neural networks?
GEOFFREY HINTON: I think big vectors of neural activities interacting with each other is how the brain works, and it’s how AI is going to work. We should definitely try and figure out how the brain does reasoning, but I think that’s going to come fairly late compared with other things.
I don’t believe hybrid systems are the answer. Let’s use the car industry as an analogy. There are some good things about a petrol engine, like you can carry a lot of energy in a small tank, but there are also some really bad things about petrol engines. Then there are electric motors, which have a lot to be said in their favor compared with petrol engines. Some people in the car industry conceded that electric motors were making progress, and then said they’d make a hybrid system and use the electric motor to inject the petrol into the engine. That’s how people in conventional AI are thinking. They have to admit that deep learning is doing amazing things, and they want to use deep learning as a kind of low-level servant to provide them with what they need to make their symbolic reasoning work. It’s just an attempt to hang on to the view they already have, without really comprehending that they’re being swept away.
MARTIN FORD: Thinking more in terms of the future of the field, I know your latest project is something you’re calling Capsules, which I believe is inspired by the columns in the brain. Do you feel that it’s important to study the brain and be informed by that, and to incorporate those insights into what you’re doing with neural networks?
GEOFFREY HINTON: Capsules is a combination of half a dozen different ideas, and it’s complicated and speculative. So far, it’s had some small successes, but it’s not guaranteed to work. It’s probably too early to talk about that in detail, but yes, it is inspired by the brain.
When people talk about using neuroscience in neural networks, most people have a very naive idea of science. If you’re trying to understand the brain, there’s going to be some basic principles, and there’s going to be a whole lot of details. What we’re after is the basic principles, and we expect the details all to be very different if we use different kinds of hardware. The hardware we have in graphics processor units (GPUs) is very different from the hardware in the brain, and one might expect lots of differences, but we can still look for principles. An example of a principle is that most of the knowledge in your brain comes from learning, it doesn’t come from people telling you facts that you then store as facts.
With conventional AI, people thought that you have this big database of facts. You also have some rules of inference. If I want to give you some knowledge, what I do is simply express one of these facts in some language and then transplant it into your head, and now you have the knowledge. That’s completely different from what happens in neural networks: you have a whole lot of parameters in your head, that is, weights of connections between neurons, and I have a whole lot of weights of connections between the neurons in my head, and there’s no way that you can give me your connection strengths. Anyway, they wouldn’t be any use to me because my neural network’s not exactly the same as yours. What you have to do is somehow convey information about how you are working so that I can work the same way, and you do that by giving me examples of inputs and outputs.
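To make that concrete, here is a minimal sketch, in PyTorch, of the kind of transfer Hinton is describing: a “student” network with a different architecture cannot simply copy a “teacher” network’s connection strengths, but it can learn to work the same way by training on the teacher’s input/output examples. The network sizes, data, and training loop below are illustrative assumptions, not code from the interview.

```python
# Illustrative sketch: knowledge is conveyed by examples, not by copying weights.
import torch
import torch.nn as nn

torch.manual_seed(0)

# "Teacher": its knowledge lives in its connection strengths (weights).
teacher = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 1))

# "Student": a different architecture, so the teacher's weights cannot be transplanted.
student = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

optimizer = torch.optim.SGD(student.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

for step in range(1000):
    x = torch.randn(64, 10)                # situations shown to both networks
    with torch.no_grad():
        y_teacher = teacher(x)             # the teacher's responses serve as the examples
    y_student = student(x)
    loss = loss_fn(y_student, y_teacher)   # the student learns to respond the same way
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```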
For example, if you look at a tweet from Donald Trump, it’s a big mistake to think that what Trump is doing is conveying facts. That’s not what he’s doing. What Trump is doing is saying that given a particular situation, here’s a way you might choose to respond. A Trump follower can then see the situation, they can see how Trump thinks they ought to respond, and they can learn to respond the same way as Trump. It’s not that some proposition is being conveyed from Trump to the follower, it’s that a way of reacting to things has been conveyed by example. That’s very different from a system that has a big store of facts, and you can copy facts from one system to another.
MARTIN FORD: Is it true that the vast majority of applications of deep learning rely heavily on labeled data, or what’s called supervised learning, and that we still need to solve unsupervised learning?
GEOFFREY HINTON: That’s not entirely true. There’s a lot of reliance on labeled data, but there are some subtleties in what counts as labeled data. For example, if I give you a big string of text and I ask you to try and predict the next word, then I’m using the next word as a label of what the right answer is, given the previous words. In that sense, it’s labeled, but I didn’t need an extra label over and above the data. If I give you an image and you want to recognize cats, then I need to give you a label “cat,” and the label “cat” is not part of the image. I’m having to create these extra labels, and that’s hard work.
If I’m just trying to predict what happens next, that’s supervised learning because what happens next acts as the label, but I don’t need to add extra labels. There’s this thing in between unlabeled data and labeled data, which is predicting what comes next.
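As a small illustration of this in-between case, the sketch below builds training pairs from raw text: the “label” for each example is simply the next word in the data itself, so no extra annotation is required. The text, context size, and variable names are assumptions made for the example.

```python
# Illustrative sketch: next-word prediction supplies its own labels.
text = "the quick brown fox jumps over the lazy dog"
words = text.split()

context_size = 3
training_pairs = []
for i in range(len(words) - context_size):
    context = words[i:i + context_size]   # the previous words are the input
    target = words[i + context_size]      # the next word acts as the label, taken from the data
    training_pairs.append((context, target))

for context, target in training_pairs:
    print(context, "->", target)
# e.g. ['the', 'quick', 'brown'] -> fox
```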
MARTIN FORD: If you look at the way a child learns, though, it’s mostly wandering around the environment and learning in a very unsupervised way.
GEOFFREY HINTON: Going back to what I just said, the child is wandering around the environment trying to predict what happens next. Then when what happens next comes along, that event is labeled to tell it whether it got it right or not. The point is, with both those terms, “supervised” and “unsupervised,” it’s not clear how you apply them to predicting what happens next.
There’s a nice clear case of supervised learning, which is that I give you an image and I give you the label “cat,” and you have to say it’s a cat. Then there’s a nice clear case of unsupervised learning, which is if I give you a bunch of images, and you have to build representations of what’s going on in the images. Finally, there’s something that doesn’t fall simply into either camp, which is if I give you a sequence of images and you have to predict the next image. It’s not clear in that case whether you should call that supervised learning or unsupervised learning, and that causes a lot of confusion.
MARTIN FORD: Would you view solving a general form of unsupervised learning as being one of the primary obstacles that needs to be overcome?
GEOFFREY HINTON: Yes. But in that sense, one form of unsupervised learning is predicting what happens next, and my point is that you can apply supervised learning algorithms to do that.
MARTIN FORD: What do you think about AGI, and how would you define that? I would take it to mean human-level artificial intelligence, namely an AI that can reason in a general way, like a human. Is that your definition, or would you say it’s something else?
GEOFFREY HINTON: I’m happy with that definition, but I think people have various assumptions of what the future’s going to look like. People think that we’re going to get individual AIs that get smarter and smarter, but I think there are two things wrong with that picture. One is that deep learning, or neural networks, are going to get much better than us at some things, while they’re still quite a lot worse than us at other things. It’s not like they’re going to get uniformly better at everything. They’re going to be much better, for example, at interpreting medical images, while they’re still a whole lot worse at reasoning about them. In that sense, it’s not going to be uniform.
The second thing that’s wrong is that people always think about it as individual AIs, and they ignore the social aspect of it. Just for pure computational reasons, making very advanced intelligence is going to involve making communities of intelligent systems, because a community can see much more data than an individual system. If it’s all a question of seeing a lot of data, then we’re going to have to distribute that data across lots of different intelligent systems and have them communicate with one another so that between them, as a community, they can learn from a huge amount of data. In that sense, the community aspect of it is going to be essential in the future.
MARTIN FORD: Do you envision it as being an emergent property of connected intelligences on the internet?
GEOFFREY HINTON: No, it’s the same with people. The reason that you know most of what you know is not because you yourself extracted that information from data, it’s because other people, over many years, have extracted information from data. They then gave you training experiences that allowed you to get to the same understanding without having to do the raw extraction from data. I think it’ll be like that with artificial intelligence too.
MARTIN FORD: Do you think AGI, whether it’s an individual system or a group of systems that interact, is feasible?
GEOFFREY HINTON: Oh, yes. I mean OpenAI already has something that plays quite sophisticated computer games as a team.
MARTIN FORD: When do you think it might be feasible for an artificial intelligence, or a group of AIs that come together, to have the same reasoning, intelligence, and capability as a human being?
GEOFFREY HINTON: If you go for reasoning, I think that’s going to be one of the things we get really good at later on, but it’s going to be quite a long time before big neural networks are really as good as people at reasoning. That being said, they’ll be better at all sorts of other things before we get to that point.
MARTIN FORD: What about for a holistic AGI, though, where a computer system’s intelligence is as good as a person?
GEOFFREY HINTON: I think there’s a presupposition that the way AIs can develop is by making individuals that are general-purpose robots like you see on Star Trek. If your question is, “When are we going to get a Commander Data?”, then I don’t think that’s how things are going to develop. I don’t think we’re going to get single, general-purpose things like that. I also think, in terms of general reasoning capacity, it’s not going to happen for quite a long time.
MARTIN FORD: Think of it in terms of passing the Turing test, and not for five minutes but for two hours, so that you can have a wide-ranging conversation that’s as good as a human being. Is that feasible, whether it’s one system or some community of systems?
GEOFFREY HINTON: I think there’s a reasonable probability that it will happen somewhere between 10 and 100 years from now. I think there’s a very small probability that it’ll happen before the end of the next decade, and I think there’s also a big probability that humanity gets wiped out by other things before those 100 years are up.
MARTIN FORD: Do you mean through other existential threats like a nuclear war or a plague?
GEOFFREY HINTON: Yes, I think so. In other words, I think there are two existential threats that are much bigger than AI. One is global nuclear war, and the other is a disgruntled graduate student in a molecular biology lab making a virus that’s extremely contagious, extremely lethal, and has a very long incubation time. I think that’s what people should be worried about, not ultra-intelligent systems.
MARTIN FORD: Some people, such as Demis Hassabis at DeepMind, do believe that they can build the kind of system that you’re saying you don’t think is going to come into existence. How do you view that? Do you think that it is a futile task?