The way you talk about curve fitting, it sounds like you’re not very impressed with machine learning.
No, I’m very impressed, because we did not expect that so many problems could be solved by pure curve fitting. It turns out they can. But I’m asking about the future — what next? Can you have a robot scientist that would plan an experiment and find new answers to pending scientific questions? That’s the next step. We also want to conduct some communication with a machine that is meaningful, and meaningful means matching our intuition. If you deprive the robot of your intuition about cause and effect, you’re never going to communicate meaningfully. Robots could not say “I should have done better,” as you and I do. And we thus lose an important channel of communication.
What are the prospects for having machines that share our intuition about cause and effect?
We have to equip machines with a model of the environment. If a machine does not have a model of reality, you cannot expect the machine to behave intelligently in that reality. The first step, one that will take place in maybe 10 years, is that conceptual models of reality will be programmed by humans.
The next step will be that machines will postulate such models on their own and will verify and refine them based on empirical evidence. That is what happened to science; we started with a geocentric model, with circles and epicycles, and ended up with a heliocentric model with its ellipses.
Robots, too, will communicate with each other and will translate this hypothetical world, this wild world, of metaphorical models.
When you share these ideas with people working in AI today, how do they react?
AI is currently split. First, there are those who are intoxicated by the success of machine learning and deep learning and neural nets. They don’t understand what I’m talking about. They want to continue to fit curves. But when you talk to people who have done any work in AI outside statistical learning, they get it immediately. I have read several papers written in the past two months about the limitations of machine learning.
Are you suggesting there’s a trend developing away from machine learning?
Not a trend, but a serious soul-searching effort that involves asking: Where are we going? What’s the next step?
That was the last thing I wanted to ask you.
I’m glad you didn’t ask me about free will.
In that case, what do you think about free will?
We’re going to have robots with free will, absolutely. We have to understand how to program them and what we gain out of it. For some reason, evolution has found this sensation of free will to be computationally desirable.
In what way?
We have the sensation of free will; evolution has equipped us with it. Evidently, it serves some computational function.
Will it be obvious when robots have free will?
I think the first evidence will be if robots start communicating with each other counterfactually, like “You should have done better.” If a team of robots playing soccer starts to communicate in this language, then we’ll know that they have a sensation of free will. “You should have passed me the ball — I was waiting for you and you didn’t!” “You should have” means you could have controlled whatever urges made you do what you did, and you didn’t. So the first sign will be communication; the next will be better soccer.
Now that you’ve brought up free will, I guess I should ask you about the capacity for evil, which we generally think of as being contingent upon an ability to make choices. What is evil?
It’s the belief that your greed or grievance supersedes all standard norms of society. For example, a person has something akin to a software module that says “You are hungry, therefore you have permission to act to satisfy your greed or grievance.” But you have other software modules that instruct you to follow the standard laws of society. One of them is called compassion. When you elevate your grievance above those universal norms of society, that’s evil.
So how will we know when AI is capable of committing evil?
When it is obvious to us that there are software components that the robot consistently ignores. When it appears that the robot follows the advice of some software components and not others, ignoring the components that maintain norms of behavior that have been programmed into it or are expected to be there on the basis of past learning, and the robot stops following those norms.