Comment author: Peter_Hurford  (EA Profile) 30 March 2017 04:31:41PM 0 points [-]
Comment author: Halstead 30 March 2017 10:47:21AM *  1 point [-]

Hi,

  1. We reference a number of lines of evidence suggesting that donating to AMF does well on sufficientarian, prioritarian, egalitarian criteria. See footnotes 23 and 24. Thus, we provide evidence for our conclusion that 'it is reasonable to believe that AMF does well on these criteria'. This, of course, is epistemically weaker than claims such as 'it is certain that AMF ought to be recommended by prioritarians, egalitarians and sufficientarians'. You seem to suggest that concluding with a weak epistemic claim is inherently problematic, but that can't be right. Surely, if the evidence provided only justifies a weak epistemic claim, making a weak epistemic claim is entirely appropriate.

  2. You seem to criticise us for the movement having not yet provided a comprehensive algorithm mapping values onto actions. But arguing that the movement is failing is very different to arguing that the paper fails on its own terms. It is not as though we frame the paper as: "here is a comprehensive account of where you ought to give if you are an egalitarian or a prioritarian". As you say, more research is needed, but we already say this in the paper.

  3. Showing that 'Gabriel fails to show that EA recommendations rely on utilitarianism' is a different task to showing that 'EA recommendations do not rely on utilitarianism'. Showing that an argument for a proposition P fails is different to showing that not-P.

Comment author: Stefan_Schubert 30 March 2017 10:17:50AM *  0 points [-]

Philosophy would attain to perfection when the mechanical labourers shall have philosophical heads, or the philosophers shall have mechanical hands.

Thomas Sprat, History of the Royal Society of London

Comment author: weeatquince  (EA Profile) 30 March 2017 09:19:03AM 0 points [-]

This is a good paper and well done to the authors.

I think section 3 is very weak. I am not flagging this as a flaw in the argument, just as the area where I see the most room for improvement in the paper and/or the most need for follow-up research. The authors do say that more research is needed, which is good.

Some examples of what I mean when I say the argument is weak:
- The paper says it is "reasonable to believe that AMF does very well on prioritarian, egalitarian, and sufficientarian criteria". But "reasonable to believe" is not a strong claim. No one has made any concerted effort to map the values of people who are not utilitarians, to come up with metrics that may represent what such people care about, and to evaluate charities on those metrics. This could be done but is not happening.
- The paper says Iason "fail[s] to show that effective altruist recommendations actually do rely on utilitarianism", but the paper also fails to show that effective altruist recommendations actually do not rely on utilitarianism.
- Etc.

Why I think more research is useful here:
- Because when the strongest case you can make for EA to people with equality as a moral intuition begins by saying "it is reasonable to believe . . . ", it is hard to make EA useful to such people. For example, when I meet people new to EA who care a lot about equality, making the case that 'if you care about minimising suffering this 'AMF' thing comes up top, and it is reasonable to assume that if you care about equality it could also be at the top, because it is effective and helps the poorest' carries a lot less weight than perhaps saying: 'hey, we funded a bunch of people who care foremost about equality, like you do, to map out their values and rank charities, and this came top.'

Note: this is a cross-post of a summarised comment on this paper from a discussion on Facebook: https://www.facebook.com/groups/798404410293244/permalink/1021820764618273/?comment_id=1022125664587783

Comment author: Raemon 29 March 2017 11:09:06PM 1 point [-]

How could bad research not make it harder to find good research? When you're looking for research, you have to look through additional material before you find the good research, and good research is fairly costly to identify in the first place.

In response to Utopia In The Fog
Comment author: remmelt  (EA Profile) 29 March 2017 10:11:59AM 2 points [-]

Great to see a nuanced, different perspective. I'd be interested in how work on existing multi-agent problems can be translated into improving the value alignment of a potential singleton (reducing the risk of theoretical abstraction uncoupling from reality).

Amateur question: would it help to also include back-of-the-envelope calculations to make your arguments more concrete?

In response to comment by Tor on Concrete project lists
Comment author: lifelonglearner 29 March 2017 02:06:31AM 1 point [-]

Just want to respond that I'd be interested in doing this sort of thing for a short period of time (a few months) to test the waters.

Comment author: Paul_Christiano 28 March 2017 10:53:56PM 4 points [-]

If you drop the assumption that the agent will be all-powerful and far beyond human intelligence then a lot of AI safety work isn't very applicable anymore, while it increasingly needs to pay attention to multi-agent dynamics

I don't think this is true in very many interesting cases. Do you have examples of what you have in mind? (I might be pulling a no-true-scotsman here, and I could imagine responding to your examples with "well that research was silly anyway.")

Whether or not your system is rebuilding the universe, you want it to be doing what you want it to be doing. Which "multi-agent dynamics" do you think change the technical situation?

the claim isn't that evolution is intrinsically "against" any particular value, it's that it's extremely unlikely to optimize for any particular value, and the failure to do so nearly perfectly is catastrophic

If evolution isn't optimizing for anything, then you are left with the agents' optimization, which is precisely what we wanted. I thought you were telling a story about why a community of agents would fail to get what they collectively want. (For example, a failure to solve AI alignment is such a story, as is a situation where "anyone who wants to destroy the world has the option," as is the security dilemma, and so forth.)

Yes, or even implementable in current systems.

We are probably on the same page here. We should figure out how to build AI systems so that they do what we want, and we should start implementing those ideas ASAP (and they should be the kind of ideas for which that makes sense). When trying to figure out whether a system will "do what we want" we should imagine it operating in a world filled with massive numbers of interacting AI systems all built by people with different interests (much like the world is today, but more).

The point you are quoting is not about just any conflict, but the security dilemma and arms races. These do not significantly change with complete information about the consequences of conflict.

You're right.

Unsurprisingly, I have a similar view about the security dilemma (e.g. think about automated arms inspections and treaty enforcement, I don't think the effects of technological progress are at all symmetrical in general). But if someone has a proposed intervention to improve international relations, I'm all for evaluating it on its merits. So maybe we are in agreement here.

Comment author: RomeoStevens 28 March 2017 07:22:54PM *  0 points [-]

Whoops, I somehow didn't see this until now. Scattered EA discourse, shrug.

I am in support of only engaging selectively.

I also agree that there is a significant risk that my views will calcify. I worry about this a fair amount, and I am interested in potential solutions,

great!

I think there is a bit of a false dichotomy between "engage in public discourse" and "let one's views calcify"; unfortunately I think the former does little to prevent the latter.

agreed

I don't understand the claim that "The principles section is an outline of a potential future straightjacket." Which of the principles in that section do you have in mind?

the whole thing. Principles are better as descriptions and not prescriptions :)

WRT preventing views from calcifying, I think it is very very important to actively cultivate something similar to

"But we ran those conversations with the explicit rule that one could talk nonsensically and vaguely, but without criticism unless you intended to talk accurately and sensibly. We could try out ideas that were half-baked or quarter-baked or not baked at all, and just talk and listen and try them again." -Herbert Simon, Nobel Laureate, founding father of the AI field

I've been researching top and breakout performance and this sort of thing keeps coming up again and again. Fortunately, creative reasoning is not magic. It has been studied and has some parameters that can be intentionally inculcated.

This talk gives a brief overview: https://vimeo.com/89936101

And I recommend skimming one of Edward de Bono's books, such as Six Thinking Hats. He outlined much of the sort of reasoning in Zero to One, The Lean Startup, and others way back in the early nineties. It may be that Open Phil is already having such conversations internally. In which case, great! That would make me much more bullish on the idea that Open Phil has a chance at outsize impact. My main proxy metric is an Umeshism: if you never output any batshit-crazy ideas, your process is way too conservative.

Comment author: RomeoStevens 28 March 2017 07:07:18PM 2 points [-]

Right to exit means the right to suicide, the right to exit geographically, the right to not participate in a process politically, etc.

Comment author: RyanCarey 28 March 2017 06:14:42PM 0 points [-]

Of course, if type x research is (in general or in this instance) not very useful, then this is of direct relevance to a post that is an instance of type x research. It seems important not to conflate these, or to move from a defense of the former to a defense of the latter.

You're imposing on my argument a structure that it didn't have. My argument is that, prima facie, analysing the concepts of effectiveness is not the most useful work that is presently to be done. If you look at my original post, it's clear that it had a parallel argument structure: i) this post seems mostly not new, and ii) posts of this kind are over-invested. It was well-hedged, and made lots of relative claims ("on the margin", "I am generally not very interested", etc.), so it's really weird to be repeatedly told that I was arguing something else.

I think that's fine, but I think it's important not to frame this as merely a disagreement about what kinds of research should be done at the margin, since this is not the source of the disagreement.

The general disagreement about whether philosophical analysis is under-invested is the source of about half of the disagreement. I've talked to Stefan and Ben, and I think that if I were convinced that philosophical analysis was prima facie under-invested at the moment, then I would view analysis of principles of effectiveness a fair bit more favorably. I could imagine that if they became fully convinced that practical work was much more neglected, then they might want to see more project proposals and literature reviews done too.

Comment author: Zeke_Sherman 28 March 2017 05:53:25PM *  1 point [-]

Thanks for the comments.

Evolution doesn't really select against what we value, it just selects for agents that want to acquire resources and are patient. This may cut away some of our selfish values, but mostly leaves unchanged our preferences about distant generations.

Evolution favors replication. But patience and resource acquisition aren't obviously correlated with any sort of value; if anything, better resource-acquirers are destructive and competitive. The claim isn't that evolution is intrinsically "against" any particular value, it's that it's extremely unlikely to optimize for any particular value, and the failure to do so nearly perfectly is catastrophic. Furthermore, competitive dynamics lead to systematic failures. See the citation.

Shulman's post assumes that once somewhere is settled, it's permanently inhabited by the same tribe. But I don't buy that. Agents can still spread through violence or through mimicry (remember the quote on fifth-generation warfare).

It seems like you are paraphrasing a standard argument for working on AI alignment rather than arguing against it.

All I am saying is that the argument applies to this issue as well.

Over time it seems likely that society will improve our ability to make and enforce deals, to arrive at consensus about the likely consequences of conflict, to understand each others' situations, or to understand what we would believe if we viewed others' private information.

The point you are quoting is not about just any conflict, but the security dilemma and arms races. These do not significantly change with complete information about the consequences of conflict. Better technology yields better monitoring, but also better hiding - which is easier, monitoring ICBMs in the 1970s or monitoring cyberweapons today?

One of the most critical pieces of information in these cases is intentions, which are easy to keep secret and will probably remain so for a long time.

By "don't require superintelligence to be implemented," do you mean systems of machine ethics that will work even while machines are broadly human level?

Yes, or even implementable in current systems.

I think the mandate of AI alignment easily covers the failure modes you have in mind here.

The failure modes here arise in a different context, where the existing research is often less relevant or not relevant at all. Whatever you put under the umbrella of alignment, there is a difference between looking at a particular system with the assumption that it will rebuild the universe in accordance with its value function, and looking at how systems interact in varying numbers. If you drop the assumption that the agent will be all-powerful and far beyond human intelligence then a lot of AI safety work isn't very applicable anymore, while it increasingly needs to pay attention to multi-agent dynamics. Figuring out how to optimize large systems of agents is absolutely not a simple matter of figuring out how to build one good agent and then replicating it as much as possible.

Comment author: Zeke_Sherman 28 March 2017 05:26:15PM *  0 points [-]

Optimizing for a narrower set of criteria allows more optimization power to be put behind each member of the set. I think it is plausible that those who wish to do the most good should put their optimization power behind a single criterion, as that gives it some chance to actually succeed.

Only if you assume that there are high thresholds for achievements.

The best candidate afaik is right to exit, as it eliminates the largest possible number of failure modes in the minimum complexity memetic payload.

I do not understand what you are saying.

Edit: do you mean the option to get rid of technological developments and start from scratch? I don't think there's any likelihood of that; it runs directly counter to all the pressures described in my post.

In response to Utopia In The Fog
Comment author: Paul_Christiano 28 March 2017 04:34:18PM 9 points [-]

It's great to see people thinking about these topics and I agree with many of the sentiments in this post. Now I'm going to write a long comment focusing on those aspects I disagree with. (I think I probably agree with more of this sentiment than most of the people working on alignment, and so I may be unusually happy to shrug off these criticisms.)

Contrasting "multi-agent outcomes" and "superintelligence" seems extremely strange. I think the default expectation is a world full of many superintelligent systems. I'm going to read your use of "superintelligence" as "the emergence of a singleton concurrently with the development of superintelligence."

I don't consider the "single superintelligence" scenario likely, but I don't think that has much effect on the importance of AI alignment research or on the validity of the standard arguments. I do think that the world will gradually move towards being increasingly well-coordinated (and so talking about the world as a single entity will become increasingly reasonable), but I think that we will probably build superintelligent systems long before that process runs its course.

The future looks broadly good in this scenario given approximately utilitarian values and the assumption that ems are conscious, with a large growing population of minds which are optimized for satisfaction and productivity, free of disease and sickness.

On total utilitarian values, the actual experiences of brain emulations (including whether they have any experiences) don't seem very important. What matters are the preferences according to which emulations shape future generations (which will be many orders of magnitude larger).

"freewheeling evolutionary developments, while continuing to produce complex and intelligent forms of organization, lead to the gradual elimination of all forms of being that we care about"

Evolution doesn't really select against what we value, it just selects for agents that want to acquire resources and are patient. This may cut away some of our selfish values, but mostly leaves unchanged our preferences about distant generations.

(Evolution might select for particular values, e.g. if it's impossible to reliably delegate or if it's very expensive to build systems with stable values. But (a) I'd bet against this, and (b) understanding this phenomenon is precisely the alignment problem!)

(I discuss several of these issues here, Carl discusses evolution here.)

Whatever the type of agent, arms races in future technologies would lead to opportunity costs in military expenditures and would interfere with the project of improving welfare. It seems likely that agents designed for security purposes would have preferences and characteristics which fail to optimize for the welfare of themselves and their neighbors. It’s also possible that an arms race would destabilize international systems and act as a catalyst for warfare.

It seems like you are paraphrasing a standard argument for working on AI alignment rather than arguing against it. If there weren't competitive pressure / selection pressure to adopt future AI systems, then alignment would be much less urgent since we could just take our time.

There may be other interventions that improve coordination/peace more broadly, or which improve coordination/peace in particular possible worlds etc., and those should be considered on their merits. It seems totally plausible that some of those projects will be more effective than work on alignment. I'm especially sympathetic to your first suggestion of addressing key questions about what will/could/should happen.

Not only is this a problem on its own, but I see no reason to think that the conditions described above wouldn’t apply for scenarios where AI agents turned out to be the primary actors and decisionmakers rather than transhumans or posthumans.

Over time it seems likely that society will improve our ability to make and enforce deals, to arrive at consensus about the likely consequences of conflict, to understand each others' situations, or to understand what we would believe if we viewed others' private information.

More generally, we would like to avoid destructive conflict and are continuously developing new tools for getting what we want / becoming smarter and better-informed / etc.

And on top of all that, the historical trend seems to basically point to lower and lower levels of violent conflict, though this is in a race with greater and greater technological capacity to destroy stuff.

I would be more than happy to bet that the intensity of conflict declines over the long run. I think the question is just how much we should prioritize pushing it down in the short run.

“the only way to avoid having all human values gradually ground down by optimization-competition is to install a Gardener over the entire universe who optimizes for human values.”

I disagree with this. See my earlier claim that evolution only favors patience.

I do agree that some kinds of coordination problems need to be solved, for example we must avoid blowing up the world. These are similar in kind to the coordination problems we confront today though they will continue to get harder and we will have to be able to solve them better over time---we can't have a cold war each century with increasingly powerful technology.

There is still value in AI safety work... but there are other parts of the picture which need to be explored

This conclusion seems safe, but it would be safe even if you thought that early AI systems will precipitate a singleton (since one still cares a great deal about the dynamics of that transition).

Better systems of machine ethics which don’t require superintelligence to be implemented (as coherent extrapolated volition does)

By "don't require superintelligence to be implemented," do you mean systems of machine ethics that will work even while machines are broadly human level? That will work even if we need to solve alignment prior long before the emergence of a singleton? I'd endorse both of those desiderata.

I think the main difference in alignment work for unipolar vs. multipolar scenarios is how high we draw the bar for "aligned AI," and in particular how closely competitive it must be with unaligned AI. I probably agree with your implicit claim, that they either must be closely competitive or we need new institutional arrangements to avoid trouble.

Rather than having a singleminded focus on averting a particular failure mode

I think the mandate of AI alignment easily covers the failure modes you have in mind here. I think most of the disagreement is about what kinds of considerations will shape the values of future civilizations.

both working on arguments that agents will be linked via a teleological thread where they accurately represent the value functions of their ancestors

At this level of abstraction I don't see how this differs from alignment. I suspect the details differ a lot, in that the alignment community is very focused on the engineering problem of actually building systems that faithfully pursue particular values (and in general I've found that terms like "teleological thread" tend to be linked with persistently low levels of precision).

Comment author: Askell 28 March 2017 09:56:44AM *  4 points [-]

There are two different claims here: one is "type x research is not very useful" and the other is "we should be doing more type y research at the margin". In the comment above, you seem to be defending the latter, but your earlier comments support the former. I don't think we necessarily disagree on the latter claim (perhaps on how to divide x from y, and the optimal proportion of x and y, but not on the core claim). But note that the second claim is somewhat tangential to the original post. If type x research is valuable, then even though we might want more type y research at the margin, this isn't a consideration against a particular instance of type x research. Of course, if type x research is (in general or in this instance) not very useful, then this is of direct relevance to a post that is an instance of type x research. It seems important not to conflate these, or to move from a defense of the former to a defense of the latter. Above, you acknowledge that type x research can be valuable, so you don't hold the general claim that type x research isn't useful. I think you do hold the view that either this particular instance of research or this subclass of type x research is not useful. I think that's fine, but I think it's important not to frame this as merely a disagreement about what kinds of research should be done at the margin, since this is not the source of the disagreement.

In response to Utopia In The Fog
Comment author: RomeoStevens 28 March 2017 08:31:18AM 2 points [-]

Optimizing for a narrower set of criteria allows more optimization power to be put behind each member of the set. I think it is plausible that those who wish to do the most good should put their optimization power behind a single criterion, as that gives it some chance to actually succeed. The best candidate afaik is right to exit, as it eliminates the largest possible number of failure modes in the minimum complexity memetic payload. Interested in arguments why this might be wrong.

Comment author: Peter_Hurford  (EA Profile) 28 March 2017 02:52:21AM 2 points [-]

We do not plan to continue the Pareto Fellowship in its current form this year. While we thought that it was a valuable experiment, the cost per participant was too high relative to the magnitude of plan changes made by the fellows. We might consider running a much shorter version of the program, without the project period, in the future. The Pareto Fellowship did, however, make us more excited about doing other high-touch mentoring and training with promising members of the effective altruism community.

From CEA's 2017 Fundraising Report.

Comment author: Zeke_Sherman 28 March 2017 02:47:01AM 1 point [-]

This is odd. Personally, my reaction is that I want to get to a project before other people do. Does bad research really make it harder to find good research? This doesn't seem like a likely phenomenon to me.

Comment author: Zeke_Sherman 28 March 2017 02:45:09AM 2 points [-]

I think we need more reading lists. There have already been one or two for AI safety, but I've not seen similar ones for poverty, animal welfare, social movements, or other topics.

In response to comment by LKor on Open Thread #36
Comment author: Zeke_Sherman 28 March 2017 02:38:43AM *  2 points [-]

We all know how many problems there are with reputation and status seeking. You would lower epistemic standards, cement power users, and make it harder for outsiders and newcomers to get any traction for their ideas.

If we do something like this it should be for very specific capabilities, like reliability, skill or knowledge in a particular domain, rather than generic reputation. That would make it more useful and avoid some of the problems.
