Stuart Russell gives an informed and easy-to-follow set of responses to questions raised at the Commonwealth Club of California. There is little in what he says here that adds to my understanding, but he puts it forward concisely.
Stuart Russell is a Professor of Computer Science, Director of the Kavli Center for Ethics, Science, and the Public, and Director of the Center for Human-Compatible AI, University of California, Berkeley; Author, Human Compatible: Artificial Intelligence and the Problem of Control.
An example of Russell's thought is below:
'But the drawback in doing that is that we have to specify those objectives, right? The machines don't dream them up by themselves. And if we mis-specify the objectives, then we have what's called a misalignment between the machine's behaviour and what humans want the future to be like.
The most obvious example of that is in social media, where we have specified objectives like maximising the number of clicks and maximising the amount of engagement of the user. The machine learning algorithms that decide what billions of people read and watch have more control over human cognitive intake than any dictator, you know, than the North Korean or Stalin or anyone has ever had.
And yet they're totally unregulated. So those algorithms learn how to maximise those objectives, and they figured out that the best way to do it is not to send you what you're interested in, but actually to manipulate you over time by thousands of little nudges so that you become a much more predictable version of yourself.
Because the more predictable you are, the more they can monetise you. And so they learned how to do that. And at least empirically, it looks as if the best way to do that is to make you more extreme, right? So that you start to consume that red meat that then whole human industries spring up to feed.
And this misalignment is the source of the concern that people have had about AI, going right back to Alan Turing, the founder of computer science, in a 1951 lecture. He said that once the machine-thinking method had started, it would leave our feeble powers far behind, and we should have to expect the machines to take control.
So they take control not because they're evil or because they spontaneously develop consciousness or anything like that. It's just because we give them some objectives that are not aligned with what we want the future to be like. And because they're more capable than us, they achieve their objectives and we don't, right? So we set up a chess match which we proceed to lose.
So in order to fix that problem, I've been following a different approach to AI, which says that the AI system, while its only objective is to further the interests of human beings, doesn't know what those are and knows that it doesn't know what those are. It's explicitly uncertain about human objectives. And so to the extent that there's a moral theory, it's simply that the job of an AI system is to further human interest. It knows that it doesn't know what those are, but it can learn more by conversing with us, by observing the choices that we make and the choices that we regret, the things that we do and the things that we don't do. So this helps it to understand what we want the future to be like. And then as it starts to learn, it can start to be more helpful.
There are still some difficult moral questions, of course. The most obvious one is that it's not one person's interest. It's not one set of values. There's 8 billion of us, so there's 8 billion different preferences about the future and how do you trade those off? And this is a two-and-a-half-thousand-year-old question, at least, and there are several different schools of thought on that. And we better figure out which is the right one, because we're going to be implementing it fairly soon.
And then there are even more difficult questions like, well, what about not the 8 billion people who are alive, but what about all the people who have yet to live? How do we take into account their interests? Right, right. What if we take actions that change who's going to live? You change the number of people who are going to live. For example, the Chinese policy of one child per family probably eliminated 500 million people already. Now they never existed. So we don't know what they would have wanted, but how, you know, how should we make that type of decision? Right. These are really difficult questions that philosophers really struggle with. But when we have AI systems that are sufficiently powerful that they could make those decisions, we need to have an answer ready so that we don't get it wrong.'
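The approach Russell describes above, an assistant that is explicitly uncertain about human objectives and learns about them by watching our choices, can be illustrated with a small toy model. The Python sketch below is not Russell's actual formulation; it assumes a made-up set of three candidate objectives, a noisily rational human, and a hand-picked confidence threshold. The assistant starts out maximally uncertain, updates its belief after each observed human choice, and defers rather than acting while that belief remains spread out.

import numpy as np

# Toy illustration (an assumption-laden sketch, not Russell's framework):
# an assistant that is uncertain which of several candidate objectives
# the human holds, and narrows that uncertainty from observed choices.

rng = np.random.default_rng(0)

# Three hypothetical objectives, each scoring four possible actions.
# Rows: candidate objectives; columns: actions 0..3. Values are invented.
candidate_utilities = np.array([
    [1.0, 0.2, 0.0, 0.5],   # candidate objective 0
    [0.0, 1.0, 0.3, 0.1],   # candidate objective 1
    [0.4, 0.1, 1.0, 0.2],   # candidate objective 2
])

# The assistant starts maximally uncertain: a uniform belief over objectives.
belief = np.ones(3) / 3

def boltzmann_choice_probs(utilities, beta=5.0):
    # Probability the human picks each action, assuming noisy rationality.
    z = np.exp(beta * utilities)
    return z / z.sum()

def update_belief(belief, observed_action, beta=5.0):
    # Bayesian update after seeing the human choose observed_action.
    likelihoods = np.array([
        boltzmann_choice_probs(u, beta)[observed_action]
        for u in candidate_utilities
    ])
    posterior = belief * likelihoods
    return posterior / posterior.sum()

# Suppose the human's true (hidden) objective is candidate 2, and the
# assistant watches a few of the human's choices.
true_objective = candidate_utilities[2]
for step in range(5):
    human_action = rng.choice(4, p=boltzmann_choice_probs(true_objective))
    belief = update_belief(belief, human_action)
    print(f"after observation {step + 1}: belief = {np.round(belief, 3)}")

# The assistant only acts on its own once its belief is concentrated enough;
# otherwise it defers to the human (a stand-in for asking).
if belief.max() > 0.9:
    best_action = int(np.argmax(candidate_utilities[np.argmax(belief)]))
    print(f"confident enough to act: choosing action {best_action}")
else:
    print("still too uncertain: deferring to the human")

The point of the sketch is only the shape of the loop Russell is gesturing at: the objective is never hard-coded into the assistant; it is something the assistant maintains a belief about and keeps revising as it observes us.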