(Cross-posted from LessWrong.)

For context, Jeff Kaufman delivered a speech on effective altruism and cause prioritization at EA Global 2015 entitled 'Why Global Poverty?', which he has transcribed and made available here. It's certainly worth reading.

I was dissatisfied with this speech in some ways. For the sake of transparency and charity, I will say that Kaufman has written a disclaimer explaining that, because of a miscommunication, he wrote this speech in the span of two hours immediately before he delivered it (instead of eating lunch, I would like to add), and that even after writing the text version, he is not entirely satisfied with the result.

I'm not that familiar with the EA community, but I predict that debates about cause prioritization, especially when existential risk mitigation is among the causes being discussed, can become mind-killed extremely quickly. And I don't mean to convey that in the tone of a wise outsider. It makes sense, considering the stakes at hand and the eschatological undertones of existential risk. (That is to say that the phrase 'save the world' can be sobering or gross, depending on the individual.) So, as is always implicit, but is sometimes worth making explicit, I'm criticizing some arguments as I understand them, not any person. I write this precisely because rationality is a common interest of many causes. I'll be focusing on the part about existential risk, as well as the parts that it is dependent upon. Lastly, I'd be interested in knowing if anyone else has criticized this speech in writing or come to conclusions similar to mine. Without further ado:

Jeff Kaufman's explanation of EA and why it makes sense is boilerplate; I agree with it, naturally. I also agree with the idea that certain existential risk mitigation strategies are comparatively less neglected by national governments and thus that risks like these are considerably less likely to be where one can make one's most valuable marginal donation. E.g., there are people who are paid to record and predict the trajectories of celestial objects, celestial mechanics is well-understood, and an impact event in the next two centuries is, with high meta-confidence, far less probable than many other risks. You probably shouldn't donate to asteroid impact risk mitigation organizations if you have to choose a cause from the category of existential risk mitigation organizations. The same goes for most natural (non-anthropogenic) risks.

The next few parts are worth looking at in detail, however:

At the other end we have risks like the development of an artificial intelligence that destroys us through its indifference. Very few people are working on this, there's low funding, and we don't have much understanding of the problem. Neglectedness is a strong heuristic for finding causes where your contribution can go far, and this does seem relatively neglected. The main question for me, though, is how do you know if you're making progress?

Everything before the question seems accurate to me. Furthermore, if I interpret the question correctly, then what's implied is a difference between the observable consequences of global poverty mitigation and existential risk mitigation. I think the implied difference is fair. You can see the malaria evaporating but you only get one chance to build a superintelligence right. (It's worth saying that AI risk is also the example that Kaufman uses in his explanation.)

However, I don't think that this necessarily implies that we can't have some confidence that we're actually mitigating existential risks. This is clear if we dissolve the question. What are the disguised queries behind the question 'How do you know if you're making progress?'

If your disguised query is 'Can I observe the consequences of my interventions and update my beliefs and correct my actions accordingly?', then in the case of existential risks, the answer is "No", at least in the traditional sense of an experiment.

If your disguised query is 'Can I have confidence in the effects of my interventions without observing their consequences?', then that seems like a different, much more complicated question that is both interesting and worth examining further. I'll expand on this conceivably more controversial bit later, so that it doesn't seem like I'm being uncharitable or quoting out of context. Kaufman continues:

First, a brief digression into feedback loops. People succeed when they have good feedback loops. Otherwise they tend to go in random directions. This is a problem for charity in general, because we're buying things for others instead of for ourselves. If I buy something and it's no good I can complain to the shop, buy from a different shop, or give them a bad review. If I buy you something and it's no good, your options are much more limited. Perhaps it failed to arrive but you never even knew you were supposed to get it? Or it arrived and was much smaller than I intended, but how do you know. Even if you do know that what you got is wrong, chances are you're not really in a position to have your concerns taken seriously.

This is a big problem, and there are a few ways around this. We can include the people we're trying to help much more in the process instead of just showing up with things we expect them to want. We can give people money instead of stuff so they can choose the things they most need. We can run experiments to see which ways of helping people work best. Since we care about actually helping people instead of just feeling good about ourselves, we not only can do these things, we need to do them. We need to set up feedback loops where we only think we're helping if we're actually helping.

Back to AI risk. The problem is we really really don't know how to make good feedback loops here. We can theorize that an AI needs certain properties not to just kill us all, and that in order to have those properties it would be useful to have certain theorems proved, and go work on those theorems. And maybe we have some success at this, and the mathematical community thinks highly of us instead of dismissing our work. But if our reasoning about what math would be useful is off there's no way for us to find out. Everything will still seem like it's going well.

I think I get where Kaufman is coming from on this. First, I'm going to use an analogy to convey what I believe to be the commonly used definition of the phrase 'feedback loop'.

If you're an entrepreneur, you want your beliefs about which business strategies will be successful to be entangled with reality. You also have a short financial runway, so you need to decide quickly, which means that you have to obtain your evidence quickly if you want your beliefs to be entangled in time for it to matter. So immediately after you affect the world, you look at it to see what happened and update on it. And this is virtuous.

And of course, people are notoriously bad at remaining entangled with reality when they don't look at it. And this seems like an implicit deficiency in any existential risk mitigation intervention; you can't test the effectiveness of your intervention. You succeed or fail, one time.

Next, let's taboo the phrase 'feedback loop'.

So, it seems like there's a big difference between first handing out insecticidal bed nets and then looking to see whether or not the malaria incidence goes down, and paying some mathematicians to think about AI risk. When the AI researchers 'make progress', where can you look? What in the world is different because they thought instead of not, beyond the existence of an academic paper?

But a big part of this rationality thing is knowing that you can arrive at true beliefs by correct reasoning, and not just by waiting for the answer to smack you in the face.

And I would argue that any altruist is doing the same thing when they have to choose between causes before they can make observations. There are a million other things that the founders of the Against Malaria Foundation could have done, but they took the risk of riding on distributing bed nets, even though they had yet to see it actually work.

In fact, AI risk is not-that-different from this, but you can imagine it as a variant where you have to predict much further into the future, the stakes are higher, and you don't get a second try after you observe the effect of your intervention.

And if you imagine a world where a global authoritarian regime involuntarily reads its citizens' minds as a matter of course, and there it is lawful that anyone who identifies as an EA is to be put in an underground chamber where they are given a minimum income that they may donate as they please, and they are allowed to reason on their prior knowledge only, never being permitted to observe the consequences of their donations, then I bet that EAs would not say, "I have no feedback loop and I therefore cannot decide between any of these alternatives."

Rather, I bet that they would say, "I will never be able to look at the world and see the effects of my actions at a time that affects my decision-making, but this is my best educated guess of what the best thing I can do is, and it's sure as hell better than doing nothing. Yea, my decision is merely rational."

You want observational consequences because they give you confidence in your ability to make predictions. But you can make accurate predictions without being able to observe the consequences of your actions, and without just getting lucky, and sometimes you have to.

But in reality we're not deciding between donating something and donating nothing. We're choosing between charitable causes. But I don't think that the fact that our interventions are less predictable should make us consider the risk more negligible or the prevention thereof less valuable. Above choosing causes where the effects of interventions are predictable, don't we want to choose the most valuable causes? A bias towards causes with consistently, predictably, immediately effective interventions doesn't seem like something that should completely dominate our decision-making process even if there's an alternative cause that can be less predictably intervened upon but that would result in outcomes with extremely high utility if successfully intervened upon.
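To make the comparison concrete, here is a minimal sketch in Python of the expected-value reasoning behind that last point. Every number in it is hypothetical and chosen only to show the shape of the argument; none is an estimate of any real cause, and the function names are made up for illustration.

```python
def expected_value(p_success: float, value_if_success: float) -> float:
    """Expected value of an intervention that pays off with probability p_success."""
    return p_success * value_if_success


def break_even_probability(baseline_ev: float, value_if_success: float) -> float:
    """Success probability at which the uncertain option matches the baseline."""
    return baseline_ev / value_if_success


# Hypothetical inputs (assumptions, not data):
# a predictable intervention that almost always works but has a modest payoff,
# and an unpredictable one that rarely works but has an enormous payoff.
predictable = expected_value(p_success=0.95, value_if_success=100.0)
uncertain = expected_value(p_success=0.01, value_if_success=1_000_000.0)

# How unlikely would success have to be before the predictable option wins?
threshold = break_even_probability(predictable, value_if_success=1_000_000.0)

print(f"predictable EV: {predictable:.1f}")
print(f"uncertain EV:   {uncertain:.1f}")
print(f"break-even success probability: {threshold:.6f}")
```

Under these made-up inputs the less predictable option comes out ahead. The more useful quantity is probably the break-even probability: the disagreement is then about whether our confidence in the uncertain intervention sits above or below that threshold, not about whether feedback loops exist.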

To illustrate, imagine that you are at some point on a long road, truly in the middle of nowhere, and you see a man whose car has a flat tire. You know that someone else may not drive by for hours, and you don't know how well-prepared the man is for that eventuality. You consider stopping your car to help; you have a spare, you know how to change tires, and you've seen it work before. And if you don't do it right the first time for some weird reason, you can always try again.

But suddenly, you notice that there is a person lying motionless on the ground, some ways down the road; far, but visible. There's no cellphone service, it would take an ambulance hours to get here unless they happened to be driving by, and you have no medical training or experience.

I don't know about you, but even if I'm having an extremely hard time thinking of things to do about a guy dying on my watch in the middle of nowhere, the last thing I do is say, "I have no idea what to do if I try to save that guy, but I know exactly how to change a tire, so why don't I just change the tire instead." Because even if I don't know what to do, saving a life is so much more important than changing a tire that I don't care about the uncertainty. And maybe if I went and actually tried saving his life, even if I wasn't sure how to go about it, it would turn out that I would find a way, or that he needed help, but he wasn't about to die immediately, or that he was perfectly fine all along. And I never would've known if I'd changed a tire and driven in the opposite direction.

And it doesn't mean that the strategy space is open season. I'm not going to come up with a new religion on the spot that contains a prophetic vision that this man will survive his medical emergency, nor am I going to try setting him on fire. There are things that will obviously not work without me trying them out. And that can be built on with other ideas that are not-obviously-wrong-but-may-turn-out-to-be-wrong-later. It's great to have an idea of what you can know is wrong even if you can't try anything. Because not being able to try more than once is precisely the problem.

If we stop talking about what rational thinking feels like, and just start talking about rational thinking with the usual words, then what I'm getting at is that, in reality, there is an inside view to the AI risk arguments. You can always talk about confidence levels outside of an argument, but it helps to go into the details of the inside view, to see where our uncertainty about various assertions is greatest. Otherwise, where is your outside estimate even coming from, besides impression?

We can't run an experiment to see if the mathematics of self-reference, for example, is a useful thing to flesh out before trying to solve the larger problem of AI risk, but there are convincing reasons that it is. And sometimes that's all you have at the time.

And if you ever ask me, "Why does your uncertainty bottom out here?", then I'll ask you, "Why does your uncertainty bottom out there?" Because it bottoms out somewhere, even if it's at the level of "I know that I know nothing," or some other similarly useless sentiment. And it's okay.

But I will say that this state of affairs is not optimal. It would be nice if we could be more confident about our reasoning in situations where we aren't able to make predictions, and then perform interventions, and then make observations that we can update on, and then try again. It's great to have medical training in the middle of nowhere.

And I will also say that I imagine that Kaufman is not talking about it being a fundamentally bad idea forever to donate to existential risk mitigation, but that it just doesn't seem like a good idea right now, because we don't know enough about when we should be confident in predictions that we can't test before we have to take action.

But if you know you're confused about how to determine the impact of interventions intended to mitigate existential risks, it's almost as if you should consider trying to figure out that problem itself. If you could crack the problem of mitigating existential risks, it would blow global poverty out of the water. And the problem doesn't immediately seem completely obviously intractable.

In fact, it's almost as if the cause you should choose is the research of existential risk strategy (a subset of cause prioritization). And, if you were to write a speech about it, it seems like it would be a good idea to make it really clear that that's probably very impactful, because value of information counts.

And so, when you read a speech that you claim is entitled 'Why Global Poverty?', I read a speech entitled 'Why Existential Risk Strategy Research?'

Comments (8)

I found a lot of this post disconcerting because of how often you linked to LessWrong posts, even when doing so didn't add anything. I think it would be better if you didn't rely on LW concepts so much and just say what you want to say without making outside references.

[I]magine that you are at some point on a long road, truly in the middle of nowhere, and you see a man whose car has a flat tire. You know that someone else may not drive by for hours, and you don't know how well-prepared the man is for that eventuality. You consider stopping your car to help; you have a spare, you know how to change tires, and you've seen it work before. And if you don't do it right the first time for some weird reason, you can always try again.

But suddenly, you notice that there is a person lying motionless on the ground, some ways down the road; far, but visible. There's no cellphone service, it would take an ambulance hours to get here unless they happened to be driving by, and you have no medical training or experience.

I don't know about you, but even if I'm having an extremely hard time thinking of things to do about a guy dying on my watch in the middle of nowhere, the last thing I do is say, "I have no idea what to do if I try to save that guy, but I know exactly how to change a tire, so why don't I just change the tire instead." Because even if I don't know what to do, saving a life is so much more important than changing a tire that I don't care about the uncertainty.

I really like this bit.

I really like this bit.

Thank you.

I found a lot of this post disconcerting because of how often you linked to LessWrong posts, even when doing so didn't add anything. I think it would be better if you didn't rely on LW concepts so much and just say what you want to say without making outside references.

I mulled over this article for quite a while before posting it, and this included the pruning of many hyperlinks deemed unnecessary. Of course, the links that remain are meant to produce a more concise article, not a more opaque one, so what you say is unfortunate to read. I would be interested in some specific examples of links or idiosyncratic language that either don't add value to or subtract value from the article.

It sure isn't good if I'm coming off as a crank though. I consider the points within this article very important.

Specific examples:

  • Linking to the Wikipedia pages for effective altruism, existential risk, etc. is unnecessary because almost all of your audience will be familiar with these terms.
  • For lots of your links, I had no problem understanding what you meant without reading the associated LW post.
  • You used a lot of LW jargon where you could have phrased things differently to avoid it: "dissolve the question", "disguised queries", "taboo", "confidence levels outside of an argument".
  • Lots of your links were tangential or just didn't add anything to what you already said: "a wise outsider", your three links for "save the world", "the commonly used definition", "you can arrive at true beliefs...", "but they took the risk of riding...", "useless sentiment", "and it's okay".

I believe the following links were fine and you could leave them in: "mind-killed", "eschatology", "a common interest of many causes", "you can see malaria evaporating", "Against Malaria Foundation" (although I'd link to the website rather than the Wikipedia page), "Existential Strategy Research". I'd remove all the others. Although you might want to remove some of these too: each of the links to LessWrong posts on this list is fine on its own, but you probably don't want to have more than one or two links to the same website/author in an article of this length. Hope that helps.

You can rephrase LW jargon with what the jargon represents (in LW jargon, "replace the symbol with the substance"):

For one example, instead of saying:

I'm not that familiar with the EA community, but I predict that debates about cause prioritization, especially when existential risk mitigation is among the causes being discussed, can become mind-killed extremely quickly. And I don't mean to convey that in the tone of a wise outsider. It makes sense, considering the stakes at hand and the eschatological undertones of existential risk. (That is to say that the phrase 'save the world' can be sobering or gross, depending on the individual.) So, as is always implicit, but is sometimes worth making explicit, I'm criticizing some arguments as I understand them, not any person. I write this precisely because rationality is a common interest of many causes. I'll be focusing on the part about existential risk, as well as the parts that it is dependent upon. Lastly, I'd be interested in knowing if anyone else has criticized this speech in writing or come to conclusions similar to mine. Without further ado:

Say:

I'm not that familiar with the EA community, but I predict that debates about cause prioritization, especially when existential risk mitigation is among the causes being discussed, can become the kinds of conversations where biases make it too hard to have a discussion based just on the facts. And I don't mean to convey that in the tone of someone outside the EA movement trying to appear smart. It makes sense, considering the stakes at hand and the connections between existential risk and weird beliefs of "life after death". (That is to say that the phrase 'save the world' can be sobering or gross, depending on the individual.) So, as is always implicit, but is sometimes worth making explicit, I'm criticizing some arguments as I understand them, not any person. I write this precisely because having more rationality is important for advancing every EA cause. I'll be focusing on the part about existential risk, as well as the parts that it is dependent upon. Lastly, I'd be interested in knowing if anyone else has criticized this speech in writing or come to conclusions similar to mine. Without further ado:

Seconded. Maybe it's normal on LW, but rather than being merely disconcerting, it's sort of worrying when people rely on an entire edifice of concepts derived from a rather controversial website.

I agree with the sentiment that is epitomized in the section that Micheal quoted. That said:

There are a million other things that the founders of the Against Malaria Foundation could have done, but they took the risk of riding on distributing bed nets, even though they had yet to see it actually work.

In 2004 they already had a large body of evidence to draw on to make the educated guess that if it has worked before, it will probably work again. And I’m also using AMF as an analogy here. It’s common practice to test an intervention through RCTs and other trials and, if it works, then to roll it out at large scale without any more trials (apart from some cheap proxy measures without a control group). It’s this experience that allows the incarcerated EAs to make educated guesses without further feedback loops.

AI risk, however, is novel and unusual in many ways, so there is little experience like that to inform any guesses, little experience that extrapolates to the field. We’re at the stage where J-PAL would come up with interventions and run RCTs on them to see if any of them have any positive effect, but we can’t do that.

But “little experience” was not meant as facetious overstatement. There are some interventions where many people have somewhat more solidly positive priors, like awareness-raising among AI researchers.

So while I agree with Jeff that the extreme dearth of feedback loops in the field is a great handicap for any proposed intervention, I also agree with you that we should tend to that dying person first and then fix the tire.

I agree with this. It's the right way to take this further by getting rid of leaky generalizations like 'Evidence is good, no evidence is bad,' and also to point out what you pointed out: is the evidence still virtuous if it's from the past and you're reasoning from it? Confused questions like that are a sign that things have been oversimplified. I've thought about the more general issues behind this since I wrote this, since I actually posted this on LW over two weeks ago. (I've been waiting for karma.) In the interim, I found an essay on Facebook by Eliezer Yudkowsky that gets to the core of why these are bad heuristics, among other things.

There are many issues, but many of them come from the laws and rights of citizens. Where these are lacking, different issues occur. For this, one should know all their poverty and citizens' laws and rights. http://www.legisocial.fr