
AI is getting more powerful, and at some point could cause significant damage.

There's a certain amount of damage that an AI could do that would scare the whole world (with effects on government and public psychology similar to the coronavirus -- both becoming willing to make sacrifices). The AI that could cause this (I naively expect) could be well short of the sophistication and power needed to really "rule the world" or be unstoppable by humans.

So it seems likely to me (but not certain) that we will get a rude (coronavirus-like) awakening as a species before we reach the point of being totally helpless to suppress AI. This awakening would give us the political will and sense of urgency to do something to limit AI.

(By limiting / suppressing AI, I mean things like: making it hard to build supercomputers or to concentrate compute, whether technologically or legally. Also, maybe, making the world less computer-legible, so that AIs that do get made have less data to work with and less connection to the world. Making it so that the inevitable non-aligned AI are stoppable. Or anything else along those lines.)

It seems like if suppressing AI were easy / safe, that would have been the first choice of AI safety people, at least until such time as alignment is thoroughly solved. But it seems like most discussion is about alignment, not suppression -- presumably on the assumption that suppression is not actually a viable option. However, given the possibility that governments may all be scrambling to do something in the wake of a "coronavirus AI", what kinds of AI suppression techniques would they be likely to try? What problems could come from them? (One obvious fear being that a government powerful enough to suppress AI could itself cause a persistent dystopia.) Is there a good, or at least better, way to deal with this situation, which EAs might work toward?


Answers

I think this is an interesting question. I don't have a solid answer, but here are some related thoughts:

  • How likely we are to land in this scenario in the first place, and what shape it might take, seems related to:
    • Questions around how "hard", "fast", and/or "discontinuous" AI takeoff will be
    • Questions like "Will we know when transformative AI is coming soon? How far in advance? How confidently?"
    • Questions like "Would there be clearer evidence of AI risk in future, if it’s indeed quite risky? Will that lead to better behaviours regarding AI safety and governance?"
    • (For notes and sources on those questions, see Crucial questions for longtermists, and particularly this doc.)
  • Your question as a whole seems similar to the last of the questions listed above.
    • And you seem to highlight the interesting idea that clearer evidence of AI risk in future (via a "sub-existential" catastrophe) could lead to worse behaviours regarding AI safety and governance.
    • And you also seem to highlight that we can/should think now about how to influence what behaviours might occur at that point (rather than merely trying to predict behaviours).

It seems like if suppressing AI were easy / safe, that would have been the first choice of AI safety people, at least until such time as alignment is thoroughly solved

  • My tentative impression is that this is true of many AI safety people, but I'm not sure it's true of all of them. That is, it's plausible to me that a decent number of people concerned about AI risk might not want to "suppress AI" even if this were tractable and wouldn't pose risks of e.g. making mainstream AI researchers angry at longtermists.
    • Here's one argument for that position: There are also other existential risks, and AI might help us with many of them. If you combine that point with certain empirical beliefs, it might suggest that slowing down (non-safety-focused) AI research could actually increase existential risk. (See Differential technological development: Some early thinking.)
    • (I'm not saying that that conclusion is correct; I don't know what the best estimates of the empirical details would reveal.)

One obvious fear being that a government powerful enough to suppress AI could itself cause a persistent dystopia.

  • I think it's slightly worse than this: permanent and complete suppression of AI would probably itself be an existential catastrophe, as it would likely result in humanity falling far short of fulfilling its potential.
    • This seems related to Bostrom's notion of "plateauing". (I also review some related ideas here.)
    • This is distinct from (and in addition to) your point that permanent and complete suppression of AI is evidence that a government is powerful enough to cause other bad outcomes.
    • (This isn't a strong argument against temporary or partial suppression of AI, though.)
Comments

(Minor, tangential point)

Making it so that the inevitable non-aligned AI are stoppable

I don't think it's inevitable that there'll ever be a "significantly" non-aligned AI that's "significantly" powerful, let alone one that's "unstoppable by default". (I'm aware that that's not a well-defined sentence.)

In a trivial sense, there are already non-aligned AIs, as shown e.g. by the OpenAI boat game example. But those AIs are already "stoppable".

If you mean to imply that it's inevitable that there'll be an AI that (a) is non-aligned in a way that's quite bad (rather than perhaps slightly imperfect alignment that never really matters much), and (b) would be unstoppable if not for some effort by longtermist-type-people to change that situation, then I'd disagree. I'm not sure how likely that is, but it doesn't seem inevitable.

(It's also possible you didn't mean "inevitable" to be interpreted literally, and/or that you didn't think much about the precise phrasing you used in that particular sentence.)

Yeah, I wasn't being totally clear about what I was really thinking in that context. I was thinking "from the point of view of people who have just been devastated by some not-exactly-superintelligent but still pretty smart AI that wasn't adequately controlled, people who want to make sure that never happens again, what would they assume is the prudent approach to whether there will be more non-aligned AI someday?", figuring that they would think "Assume that if there are more, it is inevitable that there will be some non-aligned ones at some point". The logic is that if we don't know how to control alignment, there's no reason to think there won't someday be significantly non-aligned ones, and we should plan for that contingency.

if we don't know how to control alignment, there's no reason to think there won't someday be significantly non-aligned ones, and we should plan for that contingency.

I at least approximately agree with that statement. 

I think there'd still be some reasons to think there won't someday be significantly non-aligned AIs. For example, a general argument like: "People really really want to not get killed or subjugated or deprived of things they care about, and typically also want that for other people to some extent, so they'll work hard to prevent things that would cause those bad things. And they've often (though not always) succeeded in the past." 

(Some discussions of this sort of argument can be found in the section on "Should we expect people to handle AI safety and governance issues adequately without longtermist intervention?" in Crucial questions.)

But I don't think those arguments make significantly non-aligned AIs implausible, let alone impossible. (Those are both vague words. I could maybe operationalise that as something like a 0.1-50% chance remaining.) And I think that that's all that's required (on this front) in order for the rest of your ideas in this post to be relevant.

In any case, both that quoted statement of yours and my tweaked version of it seem very different from the claim "if we don't currently know how to align/control AIs, it's inevitable there'll eventually be significantly non-aligned AIs someday"? 

In any case, both that quoted statement of yours and my tweaked version of it seem very different from the claim "if we don't currently know how to align/control AIs, it's inevitable there'll eventually be significantly non-aligned AIs someday"?

Yes, I agree that there's a difference.

I wrote up a longer reply to your first comment (the one marked "Answer"), but then I looked up your AI safety doc and realized that I should probably read through the readings in that first.
