
Beyond ChatGPT: Russell on AI


Stuart Russell gives an informed and easy-to-follow set of responses to questions raised at the Commonwealth Club of California. There is little here that is new to me, but he puts it forward concisely.

Stuart Russell is a Professor of Computer Science at the University of California, Berkeley, where he directs the Kavli Center for Ethics, Science, and the Public and the Center for Human-Compatible AI. He is the author of Human Compatible: Artificial Intelligence and the Problem of Control.

An example of Russell's thinking is below:

'But the drawback in doing that is that we have to specify those objectives, right? The machines don't dream them up by themselves. And if we mis-specify the objectives, then we have what's called a misalignment between the machine's behaviour and what humans want the future to be like.

The most obvious example of that is in social media, where we have specified objectives like maximising the number of clicks and maximising the amount of engagement of the user. The machine learning algorithms that decide what billions of people read and watch have more control over human cognitive intake than any dictator, than the North Korean leader or Stalin or anyone, has ever had.

And yet they're totally unregulated. So those algorithms learn how to maximise those objectives, and they figured out that the best way to do it is not to send you what you're interested in, but actually to manipulate you over time by thousands of little nudges so that you become a much more predictable version of yourself.

Because the more predictable you are, the more they can monetise you. And so they learned how to do that. And at least empirically, it looks as if the best way to do that is to make you more extreme, right? So that you start to consume that red meat that then whole human industries spring up to feed.

And this misalignment is the source of the concern that people have had about AI, going right back to Alan Turing, the founder of computer science, who said in a 1951 lecture that once the machine-thinking method had started, it would leave our feeble powers far behind, and we should have to expect the machines to take control.

So they take control not because they're evil or because they spontaneously develop consciousness or anything like that. It's just because we give them some objectives that are not aligned with what we want the future to be like. And because they're more capable than us, they achieve their objectives and we don't, right? So we set up a chess match which we proceed to lose.

So in order to fix that problem, I've been following a different approach to AI, which says that the AI system, while its only objective is to further the interests of human beings, doesn't know what those are and knows that it doesn't know what those are. It's explicitly uncertain about human objectives. And so to the extent that there's a moral theory, it's simply that the job of an AI system is to further human interest. It knows that it doesn't know what those are, but it can learn more by conversing with us, by observing the choices that we make and the choices that we regret, the things that we do and the things that we don't do. So this helps it to understand what we want the future to be like. And then as it starts to learn, it can start to be more helpful.

There are still some difficult moral questions, of course. The most obvious one is that it's not one person's interest. It's not one set of values. There's 8 billion of us, so there's 8 billion different preferences about the future and how do you trade those off? And this is a two-and-a-half-thousand-year-old question, at least, and there are several different schools of thought on that. And we better figure out which is the right one, because we're going to be implementing it fairly soon.

And then there are even more difficult questions like, well, what about not just the 8 billion people who are alive, but all the people who have yet to live? How do we take their interests into account? What if we take actions that change who is going to live, that change the number of people who are going to live? For example, the Chinese policy of one child per family probably eliminated 500 million people already. Now they never existed, so we don't know what they would have wanted, but how should we make that type of decision? These are really difficult questions that philosophers really struggle with. But when we have AI systems that are sufficiently powerful that they could make those decisions, we need to have an answer ready so that we don't get it wrong.'
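Russell's proposal, that an assistant's only objective is to further human interests while remaining explicitly uncertain about what those interests are, can be sketched in a few lines of Python. The sketch below is purely illustrative and is not Russell's (or CHAI's) actual formulation: the candidate objectives, the toy likelihood function, and the specific probabilities are all assumptions invented for the example. It shows an assistant that starts out maximally uncertain about which objective the human holds and narrows that uncertainty with a Bayesian update each time it observes a choice.

# Illustrative sketch only (not Russell's actual formulation): an assistant that is
# explicitly uncertain about which objective the human holds, and that narrows
# this uncertainty by observing the human's choices.

# Hypothetical candidate objectives the human might hold (assumed for the example).
CANDIDATE_OBJECTIVES = ["maximise_free_time", "maximise_income", "maximise_health"]

# Start with uniform uncertainty: the assistant knows that it does not know.
beliefs = {obj: 1.0 / len(CANDIDATE_OBJECTIVES) for obj in CANDIDATE_OBJECTIVES}

def likelihood(observed_choice: str, objective: str) -> float:
    """How probable this choice is if the human holds this objective.
    A stand-in for a real model of (noisily rational) human behaviour."""
    # Assumed toy mapping: each choice weakly signals one objective.
    signals = {
        "took_the_afternoon_off": "maximise_free_time",
        "accepted_overtime": "maximise_income",
        "went_for_a_run": "maximise_health",
    }
    return 0.7 if signals.get(observed_choice) == objective else 0.15

def update_beliefs(observed_choice: str) -> None:
    """Bayesian update of the assistant's belief over the human's objective."""
    for obj in beliefs:
        beliefs[obj] *= likelihood(observed_choice, obj)
    total = sum(beliefs.values())
    for obj in beliefs:
        beliefs[obj] /= total

# The assistant watches a few choices and becomes more, but never fully, certain.
for choice in ["took_the_afternoon_off", "went_for_a_run", "took_the_afternoon_off"]:
    update_beliefs(choice)

print(beliefs)  # residual uncertainty remains, so the assistant stays deferential

The point the sketch tries to capture is the one Russell makes above: because the assistant never collapses to complete certainty about what we want, it retains a reason to keep deferring to, and learning from, the humans it serves.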
