Comment author: kbog  (EA Profile) 16 August 2017 01:06:03PM *  0 points [-]

I don't think I can argue for intrinsically valuing anything. I agree with not being able to argue ought from is.

The is-ought problem doesn't say that you can't intrinsically value anything. It just says that it's hard. There are lots of ways to argue for intrinsically valuing things, and I have a reason to intrinsically value well-being, so why should I divert attention to something else?

Unless it is omniscient I don't see how it will see all threats to itself.

It will see most threats to itself in virtue of being very intelligent and having a lot of data, and will have a much easier time by not being in direct competition. Basically all known x-risks can be eliminated if you have zero coordination and competition problems.

Also what happens if a decisive strategic advantage is not possible and this hypothetical single AI does not come into existence. What is the strategy for that chunk of probability space?

Democratic oversight, international cooperation, good values in AI, FDT to facilitate coordination, stuff like that.

I'm personally highly skeptical that this will happen.

Okay, but the question was "is a single AI a good thing," not "will a single AI happen".

How would that be allowed if those people might create a competitor AI?

It will be allowed by allowing them to exist without allowing them to create a competitor AI. What specific part of this do you think would be difficult? Do you think that everyone who is allowed to exist must have access to supercomputers free of surveillance?

Comment author: WillPearson 18 August 2017 06:13:11PM *  0 points [-]

The is-ought problem doesn't say that you can't intrinsically value anything

I never said it did; I said it means I can't argue that you should intrinsically value anything. What arguments could I give to a paper-clipper to stop its paper-clipping ways?

That said I do think I can argue for a plurality of intrinsic values.

1) They allow you to break ties. If there are two situations with equal well-being, then having a second intrinsic value gives you a way of picking between the two (sketched in code after these two points).

2) It might be computationally or informationally expensive to calculate your intrinsic value. Having another value that is not at odds with it, and that generally correlates with it, allows you to optimise for that instead. For example, you could optimise for political freedom, which probabilistically leads to more eudaemonia, even if more political freedom does not lead to more eudaemonia in every case; after all, you can't measure the eudaemonia of everyone.
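A minimal sketch of the tie-breaking idea in point 1, in Python (the value names and numbers are purely illustrative assumptions, not measurements of anything): the second value only comes into play when the first cannot distinguish two situations.

    # Hypothetical sketch only: a lexicographic preference where well-being is
    # the primary value and political freedom breaks ties.
    from typing import NamedTuple


    class WorldState(NamedTuple):
        name: str
        wellbeing: float          # primary intrinsic value
        political_freedom: float  # secondary value, used only to break ties


    def preferred(a: WorldState, b: WorldState) -> WorldState:
        """Pick by well-being first; fall back to freedom when it is tied."""
        if a.wellbeing != b.wellbeing:
            return a if a.wellbeing > b.wellbeing else b
        return a if a.political_freedom >= b.political_freedom else b


    if __name__ == "__main__":
        x = WorldState("status quo", wellbeing=10.0, political_freedom=3.0)
        y = WorldState("reform", wellbeing=10.0, political_freedom=7.0)
        print(preferred(x, y).name)  # equal well-being, so freedom decides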

It will see most threats to itself in virtue of being very intelligent and having a lot of data, and will have a much easier time by not being in direct competition. Basically all known x-risks can be eliminated if you have zero coordination and competition problems.

I am thinking about unknown internal threats. One possibility that I alluded to is that it modifies itself to improve itself, but does so on a shaky premise and destroys itself. Another possibility is that parts of it may degrade and/or get damaged and it gets the equivalent of cancers.

I'm personally highly skeptical that this will happen.

Okay, but the question was "is a single AI a good thing," not "will a single AI happen"

I was assuming a single, morally imperfect AI, as that seems to me the most likely outcome of the drive towards a single AI.

It will be allowed by allowing them to exist without allowing them to create a competitor AI. What specific part of this do you think would be difficult? Do you think that everyone who is allowed to exist must have access to supercomputers free of surveillance?

If they are not free of surveillance, then they have not left the society. I think a world where we can allow everyone to have supercomputers, because they are smart and wise enough to use them well, would be preferable.

Comment author: WillPearson 18 August 2017 06:04:42PM *  0 points [-]

What are the most important considerations for assessing charities doing uncertain-return stuff?

I think an important one is: How likely is the project to reduce the uncertainty of the return?

E.g. will it resolve a crucial consideration?

Edit to give more detail:

Resolving a crucial consideration increases the value of all your future research massively. Take, for example, the question of whether there will be a hard or a soft take-off. A hard take-off favours doing AI safety work now, whereas a soft take-off favours building political and social institutions that encourage cooperation and avoid wars. As both put humanity's future on the line, they are equally and massively important, conditional on being the scenario that actually happens.

Resolving the question (or at least driving down the uncertainty) would allow the whole community to focus on the right scenario and get a lot better bang for their buck, even if doing so doesn't directly address the problem itself.

Comment author: purplepeople 10 August 2017 06:16:40PM 1 point [-]

Nitpick: On the "How" tab of the site, it should be "Humanity's autonomy", not "Humanities autonomy".

Comment author: WillPearson 10 August 2017 09:48:12PM 0 points [-]

Thanks, fixed. I should put some money towards a copy editor at some point, or some time into figuring out an automated solution.

Comment author: MichaelPlant 10 August 2017 02:33:07PM 0 points [-]

I don't know who downvoted this, but I think it's rude and unhelpful to downvote a post without leaving an explanation of why you did so, unless the post is blatant spam, which this is not. EAs should be upholding norms of considerateness and encouraging intellectual debate.

Have upvoted to balance out. I may make a substantive comment on autonomy later.

Comment author: WillPearson 10 August 2017 05:51:39PM 0 points [-]

I suspect it might be because I've not couched the website enough in terms of EA? I like EA and would love to work with people from the community and perhaps make the website more EA-friendly. I've not been massively encouraged by the EA community yet.

There is lots to say about autonomy, so I look forward to any forthcoming comments.

Comment author: Kaj_Sotala 22 July 2017 08:22:05AM 1 point [-]

Another discussion and definition of autonomy, by philosopher John Danaher:

Many books and articles have been written on the concept of ‘autonomy’. Generations of philosophers have painstakingly identified necessary and sufficient conditions for its attainment, subjected those conditions to revision and critique, scrapped their original accounts, started again, given up and argued that the concept is devoid of meaning, and so on. I cannot hope to do justice to the richness of the literature on this topic here. Still, it’s important to have at least a rough and ready conception of what autonomy is and the most general (and hopefully least contentious) conditions needed for its attainment.

I have said this before, but I like Joseph Raz’s general account. Like most people, he thinks that an autonomous agent is one who is, in some meaningful sense, the author of their own lives. In order for this to happen, he says that three conditions must be met:

Rationality condition: The agent must have goals/ends and must be able to use their reason to plan the means to achieve those goals/ends.

Optionality condition: The agent must have an adequate range of options from which to choose their goals and their means.

Independence condition: The agent must be free from external coercion and manipulation when choosing and exercising their rationality.

I have mentioned before that you can view these as ‘threshold conditions’, i.e. conditions that simply have to be met in order for an agent to be autonomous, or you can have a slightly more complex view, taking them to define a three dimensional space in which autonomy resides. In other words, you can argue that an agent can have more or less rationality, more or less optionality, and more or less independence. The conditions are satisfied in degrees. This means that agents can be more or less autonomous, and the same overall level of autonomy can be achieved through different combinations of the relevant degrees of satisfaction of the conditions. That’s the view I tend to favour. I think there possibly is a minimum threshold for each condition that must be satisfied in order for an agent to count as autonomous, but I suspect that the cases in which this threshold is not met are pretty stark. The more complicated cases, and the ones that really keep us up at night, arise when someone scores high on one of the conditions but low on another. Are they autonomous or not? There may not be a simple ‘yes’ or ‘no’ answer to that question.

Anyway, using the three conditions we can formulate the following ‘autonomy principle’ or ‘autonomy test’:

Autonomy principle: An agent’s actions are more or less autonomous to the extent that they meet the (i) rationality condition; (ii) optionality condition and (iii) independence condition.

Comment author: WillPearson 22 July 2017 10:27:00AM 1 point [-]

Thanks. I know I need to do more reading around this. This looks like a good place to start.

Comment author: kbog  (EA Profile) 21 July 2017 04:04:18PM *  4 points [-]

Some of the reasons you gave in favor of autonomy come from a perspective of subjective pragmatic normativity rather than universal moral values, and don't make as much sense when society as a whole is analyzed. E.g.:

You disagree with the larger system for moral reasons, for example if it is using slavery or polluting the seas. You may wish to opt out of the larger system in whole or in part so you are not contributing to the activity you disagree with.

But it's equally plausible that the larger system will be enforcing morally correct standards and a minority of individuals will want to do something wrong (like slavery or pollution).

The larger system is hostile to you. It is an authoritarian or racist government. There are plenty of examples of this happening in history, so it will probably happen again.

Individuals could be disruptive or racist, and the government ought to restrain their ability to be hostile towards society.

So when we decide how to alter society as a whole, it's not clear that more autonomy is a good thing. We might be erring on different sides of the line in different contexts.

Moreover, I don't see a reason that we ought to intrinsically value autonomy. The reasons you gave only support autonomy instrumentally through other values. So we should just think about how to reduce catastrophic risks and how to improve the economic welfare of everyone whose jobs were automated. Autonomy may play a role in these contexts, but it will then be context-specific, so our definition of it and analysis of it should be contextual as well.

The autonomy view vastly prefers a certain outcome to the AI risk question. It is not in favour of creating a single AI that looks after us all (especially not by uploading)

But by the original criteria, a single AI would (probably) be robust to catastrophe due to being extremely intelligent and having no local competitors. If it is a good friendly AI, then it will treat people as they deserve, not on the basis of thin economic need, and likewise it will always be morally correct. It won't be racist or oppressive. I bet no one will want to leave its society, but if we think that that right is important then we can design an AI which allows for that right.

I think this is the kind of problem you frequently get when you construct an explicit value out of something which was originally grounded in purely instrumental terms - you reach some inappropriate conclusions because future scenarios are often different from present ones in ways that remove the importance of our present social constructs.

Comment author: WillPearson 21 July 2017 06:11:49PM *  0 points [-]

But it's equally plausible that the larger system will be enforcing morally correct standards and a minority of individuals will want to do something wrong (like slavery or pollution).

Both of these would impinge on the vital sets of others though (slavery directly, pollution by disrupting the natural environment people rely on). So it would still be a bad outcome from the autonomy viewpoint if these things happened.

The autonomy viewpoint only argues that lots of actions should be physically possible for people; not all of the actions that are physically possible are necessarily morally allowed.

Which of these three scenarios is best?

  1. No one has guns, so no one gets shot.
  2. Everyone has guns, and people get shot because they have accidents.
  3. Everyone has guns, but no one gets shot because they are well trained and smart.

The autonomy viewpoint argues that the third is the best possible outcome and tries to work towards it. There are legitimate uses for guns.

I don't go into how these things should be regulated, as this is a very complicated subject. I'll just point out that to get the robust free society that I want, you would need not to regulate the ability to do these things, but to make sure the incentive structures and education are correct.

I don't see a reason that we ought to intrinsically value autonomy. The reasons you gave only support autonomy instrumentally through other values.

I don't think I can argue for intrinsically valuing anything. I agree with not being able to argue ought from is. So the best I can do is to claim it as a value, which I do, and to refer to other value systems that people might share. I suppose I could talk about other people who value autonomy in itself. Would you find that convincing?

But by the original criteria, a single AI would (probably) be robust to catastrophe due to being extremely intelligent and having no local competitors.

Unless it is omniscient I don't see how it will see all threats to itself. It may lose a gamble on the logical induction lottery and make an ill-advised change to itself.

Also what happens if a decisive strategic advantage is not possible and this hypothetical single AI does not come into existence. What is the strategy for that chunk of probability space?

If it is a good friendly AI, then it will treat people as they deserve, not on the basis of thin economic need, and likewise it will always be morally correct. It won't be racist or oppressive.

I'm personally highly skeptical that this will happen.

I bet no one will want to leave its society, but if we think that that right is important then we can design an AI which allows for that right.

How would that be allowed if those people might create a competitor AI?

you reach some inappropriate conclusions because future scenarios are often different from present ones in ways that remove the importance of our present social constructs.

Like I said in the article, if I was convinced of decisive strategic advantage my views of the future would be very different. However as I am not, I have to think that the future will remain similar to the present in many ways.

Comment author: Taylor 16 July 2017 05:38:41PM *  3 points [-]

Really appreciate you taking the time to write this up! My initial reaction is that the central point about mindset-shifting seems really right.

My proposal is to explicitly talk about two kinds of EA (these may need catchier names)

It seems (to me) “low-level” and “high-level” could read as value-laden in a way that might make people practicing “low-level” EA (especially in cause areas not already embraced by lots of other EAs) feel like they’re not viewed as “real” EAs and so work at cross-purposes with the tent-broadening goal of the proposal. Quick brainstorm of terms that make some kind of descriptive distinction instead:

  1. cause-blind EA vs. cause-specific or cause-limited EA
  2. broad EA vs. narrow EA
  3. inter-cause vs. intra-cause

(Thoughts/views only my own, not my employer’s.)

Comment author: WillPearson 16 July 2017 08:39:34PM *  0 points [-]

Hmm, maybe

  • Global EA vs local EA
  • Total EA vs focused EA

Comment author: Daniel_Dewey 10 July 2017 07:35:51PM 3 points [-]

Thanks for these thoughts. (Your second link is broken, FYI.)

On empirical feedback: my current suspicion is that there are some problems where empirical feedback is pretty hard to get, but I actually think we could get more empirical feedback on how well HRAD can be used to diagnose and solve problems in AI systems. For example, it seems like many AI systems implicitly do some amount of logical-uncertainty-type reasoning (e.g. AlphaGo, which is really all about logical uncertainty over the result of expensive game-tree computations) -- maybe HRAD could be used to understand how those systems could fail?

I'm less convinced that the "ignored physical aspect of computation" is a very promising direction to follow, but I may not fully understand the position you're arguing for.

Comment author: WillPearson 10 July 2017 09:58:19PM 1 point [-]

Fixed, thanks.

I agree that HRAD might be useful. I read some of the stuff. I think we need a mix of theory and practice, and only when we have a community where they can feed into each other will we actually get somewhere. When an AI safety theory paper says, "Here is an experiment we can do to disprove this theory," then I will pay more attention than I do now.

The "ignored physical aspect of computation" is less about a direction to follow, but more an argument about the type of systems that are likely to be effective and so an argument about which ones we should study. There is no point studying how to make ineffective systems safe if the lessons don't carry over to effective ones.

You don't want a system that puts the same computational resources into deciding which brand of oil is best for its bearings as it does into deciding what is or is not a human. If you decide how much computational resource you want to put into each class of decision, you start to get into meta-decision territory. You also need to decide how much of your pool you want to put into making that meta-decision, as making it will take away from making your other decisions.

I am thinking about a possible system which can allocate resources among decision-making systems, and this can be used to align the programs (at least somewhat). It cannot align a superintelligent malign program; work needs to be done on the initial population of programs in the system so that we can make sure such programs do not appear, or we need a different way of allocating resources entirely.
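As a rough illustration of the sort of allocator I have in mind, here is a minimal Python sketch only; the decision classes, the weights and the proportional rule are placeholder assumptions, not a worked-out design.

    # Hypothetical sketch only: split a fixed compute budget across decision
    # classes in proportion to an externally supplied importance weight, and
    # reserve a slice for the meta-decision of revising those weights.
    from dataclasses import dataclass
    from typing import Dict, List


    @dataclass
    class DecisionClass:
        name: str
        importance: float  # weight supplied from outside the system


    def allocate(budget: float, classes: List[DecisionClass],
                 meta_fraction: float = 0.05) -> Dict[str, float]:
        """Divide `budget` proportionally to importance, keeping back a
        share for the meta-decision of re-weighting the classes."""
        meta_budget = budget * meta_fraction
        remaining = budget - meta_budget
        total = sum(c.importance for c in classes)
        shares = {c.name: remaining * c.importance / total for c in classes}
        shares["meta: revise the weights"] = meta_budget
        return shares


    if __name__ == "__main__":
        classes = [
            DecisionClass("decide what counts as a human", importance=100.0),
            DecisionClass("choose bearing oil", importance=1.0),
        ]
        for name, share in allocate(1000.0, classes).items():
            print(f"{name}: {share:.1f} compute units")

In this toy version the importances are fixed from outside; how they get set and revised is exactly the part that would need the safety work described above.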

I don't pick this path because it is an easy path to safety, but because I think it is the only path that leads anywhere interesting/dangerous and so we need to think about how to make it safe.

Comment author: Peter_Hurford  (EA Profile) 09 July 2017 12:48:26AM 4 points [-]

If one disagreed with an HRAD-style approach for whatever reason but still wanted to donate money to maximize AI safety, where should one donate? I assume the Far Future EA Fund?

Comment author: WillPearson 09 July 2017 10:59:40AM 2 points [-]

On the meta side of things:

I found AI Impacts recently. There is a group I am loosely affiliated with that is trying to make a MOOC about AI safety.

If you care about doing something about risks of immense suffering (s-risks), you might like the Foundational Research Institute.

There is an overview of other charities, but it is more favourable to HRAD-style work.

I would like to set up an organisation that studies autonomy and our response to making more autonomous things (especially with regard to administrative autonomy). I have a book slowly brewing, so if you are interested in that, get in contact.

Comment author: JesseClifton 07 July 2017 10:13:46PM 1 point [-]

Great piece, thank you.

Regarding "learning to reason from humans", to what extent do you think having good models of human preferences is a prerequisite for powerful (and dangerous) general intelligence?

Of course, the motivation to act on human preferences is another matter - but I wonder if at least the capability comes by default?

Comment author: WillPearson 09 July 2017 10:07:31AM 0 points [-]

My own 2 cents: it depends a bit on what form of general intelligence is made first. There are at least two possible models.

  1. Super intelligent agent with a specified goal
  2. External brain lobe

With the first, you need to be able to specify human preferences in the form of a goal, which enables it to pick the right actions.

The external brain lobe would start out not very powerful and without any explicit goals, but would be hooked into the human motivational system and develop goals shaped by human preferences.

HRAD is explicitly about the first. I would like both to be explored.
