Gillian Hadfield - Building an Off Switch for AI

Transcript

So I'm going to talk about off switches, but I am talking about a governance solution here.

I wanted to put it in the frame of this: This community has been talking about off switches for quite some time.

This is a paper from 2017, which is largely in response to the pushback saying there's no AI safety problem for us to think about because we have an off switch,

we just need to unplug it if the machines start doing things wrong. And so people have been thinking in deep ways from a technical perspective about

the obstacles to just turning machines off. But the idea that there is no off switch has started to become quite widespread.

I found in conversations - particularly over the last year, shall we say - the idea that we really can't stop things.

This is from Ezra Klein's column from just earlier in November,

saying one of his lessons from the upheaval at OpenAI was a

real concern that we really just don't have the capacity, if you thought of the nonprofit governance board as an off switch, right.

We could turn things off if we thought it was unsafe. He's taking the lesson to say,

one of the things we've learned is that that's a bit of a paper tiger when you got billions of dollars on the other side.

So he concluded this column with the ominous "there is no off switch." There again, you sort of see lots of people saying it's really just inevitable:

the nature of competition, the nature of geopolitics. You just cannot stop it.

Jurgen Schmidhuber is saying here, surely not on an international level. I hear a lot of comments like this because,

you know, even if one country wants to do that, other countries may have different goals.

So there's a real throwing up of hands. So this is also thinking about the response to the pause letter.

And Geoff Hinton gave a talk that we sponsored earlier in October this year at the University of Toronto,

on the question of will digital intelligence replace biological intelligence? His answer:

Yes... or at least, my best guess is they will take over. They'll be much more intelligent than people ever were,

and that we will just be a passing stage. That's my best guess, and I hope I'm wrong. But there's

a strong sense of inevitability that there's no way to slow things down. There's no way to stop things.

And so what I want to talk about is the idea that maybe there are some things we're not taking into account.

Let's see. So this is the question: is global proliferation of increasingly powerful,

unsafe AI systems just inevitable? Anybody who's talking about P(Doom)s,

I guess, is thinking about it as just a doom scenario. So I want to suggest that it's maybe not. Now,

maybe there should be a question mark after this because I don't want this to be--

I want to throw these ideas out here. There are lots of holes to poke, but we have legal and regulatory tools for

turning off or at least slowing down AI that I think we're not thinking through.

And so part of what I want to bring to the conversation is just a bit more of a sophisticated nuanced

idea about how regulation and governance can work because I think we might primarily have been talking about standards,

the idea that we would impose requirements on developers. And then everybody says, yeah,

but our governments are stupid and they're weak and they won't be able to stop anybody and those bad governments

won't do it at all either. So I think there's a little, actually a little bit more hope here than we think.

So. Here's the structure of my argument. First of all, recognize that our frontier AI systems,

those big advances we're seeing are driven by private sector, market competition.

Fundamentally, we often see this as part of the problem that because it's in the private sector,

there are certainly challenges we face there because we can't see what's happening inside.

We can't study the models unless you're inside the firms. But it's being driven by private sector market competition.

Now, if that wasn't the case, if this was more like nuclear technology exclusively produced in the public sector,

we have other kinds of tools available. Now, just imagine that world where you needed the scale of government and only

the scale of government could produce these technologies. Then we're looking at agreements between countries and we have seen some successes around powerful

technologies like nuclear. But that's not the argument I want to make here. I just want to say,

let's look at the world we're in. The world we're in is that this is coming out of the private sector market competition.

And that's when I'm going to say, OK, now I'm an economist, I want to talk about some of the economics of that.

[audience member] Is your claim that you can have tools to stop it also in China, or only in the US?

Let me get through it and I'll get to China at the end. But I just want to develop this argument here.

So it's private sector market competition. The incentive to invest in foundation models

scales with market demand for applications. So,

why did we see the move from a nonprofit to a capped for-profit company for OpenAI?

It was because of the need for high levels of private sector investment, and that was something that required participating in markets.

And so the incentive to invest in foundation models, the billions that are getting poured in, is because of the potential

economic value of securing those applications - and it's ginormous, right?

Because... my favorite title ever: "GPTs are GPTs." Generative pretrained transformers are general purpose technologies, right?

If you're inventing intelligence, it's in everything. So think about the scale of the world economy, and we're thinking about that kind of scale.

So that is what is attracting, I mean, just unheard-of amounts of investment, and it will continue to do so.

Again, this is a market incentive to invest because there's a market at the end of the road.

If you build it, then you have this very powerful economic benefit. Now this is the thing that's really important.

We actually have tons of structure in place to control who and what participates in our markets.

This is basic economic infrastructure, legal infrastructure of markets.

And we have lots and lots of infrastructure in place to decide who can and who can't.

I'll develop this in a moment. [audience question] Who is "we"?

Oh, we humans designing economies, right? So that's right.

We are living - modern humans living in the kinds of structures that we have advanced.

Certainly, I'm thinking here, advanced market economies.

We control who and what participates in our markets.

And therefore this is the nature of the argument we have tools to reduce demand for frontier models.

If you reduce demand for frontier models, you reduce investment in frontier models.

And that's one way of affecting the pace or the extent of development of frontier models.

So it's just really important to say like this isn't just scientific interest and the

idea that these ideas will just continue.

And maybe what we're seeing is actually if it was just the scientific interest, right?

We may not be able to do it because you can't, you can't afford the cost for the massive amount of compute involved.

OK. So now I want to give you some more details on what this regulatory infrastructure looks like, and it's

related to this proposal that I've put forward with Tino Cuellar, who's president of the Carnegie Endowment for International Peace,

and Tim O'Reilly, whom I'm sure many of you know, from O'Reilly publishing.

And I will say this is an idea that I sketched out like four days after signing the pause [AI] letter because it was like,

OK, what should we do with the pause?

I was just trying to pitch it as: here's a very straightforward thing you can do right now and we should do right now.

And thanks to Adam for mentioning that one of the things that came out of the International Dialogue for AI Safety at Ditchley Park was the registry.

So I'm going to walk you through this registry proposal. So very simple.

Number one, create a national registry. Call it an agency... it could be an office in another department.

And I'm proposing here that you could do it at the national level. You know, you could always get consortia of national registries,

but right now just think about the US or Canada, the UK, France, Germany, whatever, creating a national registry agency.

And then what that agency does is it establishes registration requirements and process.

OK. So it establishes what it takes to be registered. And then what you have to do to get this. Now,

you should think about here if you want to start a company, there's a registration process for registering your company.

And then this is the key step. You make it illegal. So you pass legislation that makes it illegal to sell

or buy or give away (so this covers open-source models as well) the services of, or the actual model itself of, unregistered models.

So, basically, this is how you create a requirement that you must be a registered model.

You must register with the national registry in order to participate in the market, right?

But notice that it's not just selling that requires registration. It's also buying.

You can only buy the services of a registered model. [audience member] What's the difference between being registered and licensed here?

Is there a difference? I will get to that. So let me get to the evolution to a licensing regime.

So I want to really emphasize because it's --

so I've been talking about legal infrastructure for a couple of decades and one of the reasons I like the language of

infrastructure is because it's invisible to most people, right?

You don't pay a lot of attention to it. There's all this legal structure in place.

There's no such thing as a free market. Markets exist on the basis of a ton of infrastructure that's there.

So we don't want to even think about this as regulation.

This is just basic legal infrastructure that we're currently not building for AI.

And this is what makes me most worried and feel most urgent about the fact that we need to get this message

out that we are living in a wild west that doesn't have just ordinary legal tools available to us.

We'd be in a lot better place. So, as an example, by comparison: in order to participate in our economies,

advanced economies, we require registration of corporations. Like you can't just start a company.

You have to go and incorporate that company and register with whatever office registers companies in your jurisdiction.

We require registration of workers. Anybody had to fill out a W-9? Or bring your passport in to get it

photocopied by HR? Or get your visa and show your work visa in order to be able to take a job, right?

That's a registration system of workers, you know, you have to share your social security number,

your social insurance number. Those are registration schemes that allow us to say you can or cannot participate.

We require registration of hot dog carts, right? We require registration of just about everything except frontier models.

Now, here's the important point. We require people who are buying the services or

interacting, participating in transactions with corporations, with workers, et cetera, to verify registration.

So we legally require banks. Banks are in violation of their legal obligations...

they're subject to criminal penalties or civil penalties... We require banks and employers to verify valid registrations.

So you may know this is "know your customer" regulation for banks, right?

We require in order to make sure that criminal enterprises, child traffickers, money launderers are not using our banking systems,

we require banks to know their customers and check, for example, the status of entities they're doing business with.

Same thing with employers: we require employers to check those visas. Right.

So the key point here is that this is a distributed enforcement regime and one of the reasons people just feel,

oh my gosh, we can't do anything, governments are not strong enough, is because we all have in our mind,

this extremely simplistic idea about law, that law is: government sets a rule and then it sends out the police to enforce it, right?

And it throws people in jail or, or attaches huge penalties, right?

But that's a centralized enforcement mechanism and, and then we worry that it's just not strong enough or governments won't spend the money on it and so on.

But this is distributed enforcement. The government doesn't spend any money to make sure that

your employer is checking your work authorization, right?

Or that the bank is checking to make sure that you have valid registration if they're giving you an account.

So this is basically a pretty intelligent scheme for having a distributed and therefore much more robust enforcement mechanism.
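
[Editor's illustration] The distributed check described above can be pictured as a lookup that each buyer runs at the point of sale, much as a bank or employer verifies registration today. This is only a minimal Python sketch under stated assumptions: the names `REGISTERED_MODELS`, `may_transact`, and `buy_service`, and the example model identifiers, are hypothetical and do not come from the proposal.

```python
# Hypothetical registry snapshot: identifiers of models with valid registration.
REGISTERED_MODELS = {"model-a", "model-b"}


def may_transact(model_id: str, registry: set[str]) -> bool:
    """Return True only if the model holds a valid registration."""
    return model_id in registry


def buy_service(buyer: str, model_id: str, registry: set[str]) -> str:
    # The buyer, not a central enforcer, performs the check before transacting.
    if not may_transact(model_id, registry):
        raise PermissionError(
            f"{buyer} may not buy from unregistered model {model_id!r}"
        )
    return f"{buyer} purchased services of {model_id}"
```

Because every buyer runs the same check, enforcement does not depend on one central inspector catching each transaction.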

So you - Yoshua, you talk about single points of failure. This is robust. This is the compliance department,

the lawyers that work at the bank, the lawyers that work at the company, the employer, they have an incentive to be checking for this.

And we're using that registration to control access to the enormous benefits of accessing our economic system.

That's what I mean, when I say we control who and what participates in our economies.

And for those of you who know the theoretical and more technical work that I do,

this is just a version of thinking about sort of the theory of what I call our human normative systems,

our legal systems, our systems for enforcing rules, creating rules and enforcing them.

This is basically what humans have done throughout human history. You create a group, you create value to being in the group,

you then have rules about being in the group and you enforce those rules by saying we're kicking you out of the group if you don't follow the rules.

So in that sense, I want to think about registration as an off switch. It's a lever you can pull because you can register,

you can refuse registration and you can deregister. So if you discover vulnerabilities,

you can deregister, you can require that, and by doing that,

you're cutting off access to markets including banks and financing.

And that's choking off the incentive to invest. And I just think it's, it's an interesting analogue.
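
[Editor's illustration] The three levers named above (register, refuse registration, deregister) can be sketched as a small state tracker. This is a toy model under stated assumptions, not a design from the proposal: the class `Registry`, the `Status` values, and the method names are all hypothetical.

```python
from enum import Enum


class Status(Enum):
    REGISTERED = "registered"
    REFUSED = "refused"
    DEREGISTERED = "deregistered"


class Registry:
    """Toy registry tracking each model's status (all names are illustrative)."""

    def __init__(self) -> None:
        self._status: dict[str, Status] = {}

    def register(self, model_id: str) -> None:
        self._status[model_id] = Status.REGISTERED

    def refuse(self, model_id: str) -> None:
        self._status[model_id] = Status.REFUSED

    def deregister(self, model_id: str) -> None:
        # The "off switch": pulled, for example, after a vulnerability is found.
        if self._status.get(model_id) is not Status.REGISTERED:
            raise ValueError(f"{model_id!r} is not currently registered")
        self._status[model_id] = Status.DEREGISTERED

    def has_market_access(self, model_id: str) -> bool:
        # Unknown, refused, and deregistered models are all shut out.
        return self._status.get(model_id) is Status.REGISTERED
```

In this sketch, market access is defined purely by registry status, so deregistering a model is enough to cut it off from every counterparty that checks.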

So, you know, here's the criminal fraud case against Donald Trump.

If he's convicted of fraud, then one of the penalties is going to be the deregistration of his company,

the decertification. And he won't be allowed to, and his family won't be allowed to participate in the economic system

in New York. They won't be able to apply for loans, they won't be able to serve as directors and officers and all of his companies basically will no longer

be controlled by him. They will go into public oversight, right? So that's sort of cutting off access.

And that will be something that will be enforced by all those financial institutions that say no,

I'm not allowed to do business with you. So initial registration and the reason I sort of wrote this up quickly at the

beginning to say we could do this very quickly and it can be very, very low cost to,

to get in place because you could just require disclosure, confidential disclosure of model attributes in order to be a registered model.

I think that's actually the wise and prudent thing to do here to build this infrastructure because we

also do not know yet what the risks are, what kinds of things,

what kinds of limits, what kinds of tests we want to put in place.

But we shouldn't be waiting around to

solve the problem of AI alignment, control, and

safety to then build the system. This is disclosure of model attributes.

So training data, compute, methods, model size, known capabilities, test and red-team results.

What safeguards are in place? Basically to get that information into the registry for

all the frontier models that want to participate in your economy.

So if this is the US, it would be in the US economy.
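
[Editor's illustration] The confidential disclosure described above could be represented as a simple record that the registry checks for completeness. This is a hedged sketch only: the field names are hypothetical labels for the attributes listed in the talk, treating "compute" and "methods" as separate fields is an assumption, and the talk does not specify any format.

```python
from dataclasses import dataclass, fields


@dataclass
class Disclosure:
    """Confidential filing; field names are hypothetical labels for the listed attributes."""

    training_data: str
    compute: str
    methods: str
    model_size: str
    known_capabilities: str
    test_and_red_team_results: str
    safeguards: str


def missing_fields(d: Disclosure) -> list[str]:
    """Return the names of attributes left blank, so an incomplete filing can be sent back."""
    return [f.name for f in fields(d) if not getattr(d, f.name).strip()]
```

A registry could accept a filing only when `missing_fields` returns an empty list, which keeps the initial requirement to disclosure alone.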

And I like to think of this as a seed crystal for growing an agency,

agency expertise for figuring out what it is we might need to do next.

So one of the things that I think is just, well,

quite stunning about where we are on the regulatory front is the fact that our governments don't have

visibility into what's being built. You have to be inside a company to know what,

what thing, you know, what was the actual training data, what were the methods,

what's the size? What do we know about it? Anybody who's inside the company?

I don't know how many people inside know the full picture,

but this is the idea that no, our government should have that information.

It shouldn't be out on the internet, it should be confidential disclosure just to the,

to the agency. But my hope is also that gives you an opportunity to,

to build some expertise. And then we can be discovering what are the

additional requirements that we might need to impose?

Like, straight off the bat, there's something everybody talks about

but you don't have a mechanism to implement:

prohibited uses and prohibited users.

Like you can't sell it to North Korea or you can't sell it to the same type of entities

you can't make loans to. Criminal, criminal enterprises, cybercrime groups and so on.

We could be evolving ideas about requirements on size or data, methods, required tests that need to be put in place.

And initially, we can just be establishing requirements like, say, you have to register anything over... and in

the proposal, I think we said, like, you have to register models GPT-4 and above, or above GPT-4.

I know that scale, the size of models, is

not always going to be directly related to capabilities,

but that's something that you can evolve. And interestingly, the EU AI Act does do

that. So that gets us to evolving licensing requirements. And this is sort of in,

in answer to Yoshua's question: how is this different from a licensing regime?

This evolves to a licensing regime. A licensing regime says there are things you have to do,

like run this test. A registration regime is just: tell us about who you are. And as you add requirements, say,

well, you have to have done these kinds of tests or you have to use these kind of methods or you have to show us these

kinds of proofs that's evolving to a licensing requirement. This is actually something that is a tool that we can imagine.

We have global markets, countries that want their companies or international companies to have access to global markets.

You can have the same requirement, right? If you want to sell the services of the model.

We probably have to develop something for selling the products that are built on the basis of the

model as well. That can be a step just like we have requirements that we established through the WTO

about what kinds of products are allowed into, into the market.

So that gives you an opportunity for global reach and again,

it gives you a lever. I mean, if you're thinking about this as an economic incentive to build,

then your capacity is to say, well, you can't have access to the global market unless you're registered,

unless you... and then we've evolved our standards in there. So yes,

people will try to cheat. Black markets and criminals exist all over the place,

but you at least want to have something in place that allows you to activate criminal enforcement,

which we currently do not because it's not illegal to do anything with a frontier model.

You choke off legitimate investors from this, and hopefully we'll get some technical work done

on how we would build not-as-hackable systems of digital proof.

I'm going to stop. Let's see. I'm just going to say that elements of this are in the executive order and the EU AI Act but

not the, not the legality part of it.

And as Adam mentioned yesterday, this mandatory registration scheme

is one of the recommendations that came out of our Ditchley meetings.

Alignment Workshop