Comment author: EricHerboso (EA Profile) 13 January 2017 07:30:50PM 7 points

I agree: it is indeed reasonable for people to have read our estimates the way they did. But when I said that we don't want others to "get the wrong idea", I'm not claiming that the readers were at fault. I'm claiming that the ACE communications staff was at fault.

Internally, the ACE research team was fairly clear about what we thought about leafleting in 2014. But the communications staff (and, in particular, I) failed to adequately get across these concerns at the time.

In 2015 and 2016, I feel ACE was good about clearly expressing our reservations whenever an issue like leafleting came up publicly. But we neglected to update the older 2014 page with the same kind of language that we now use when talking about these things. We are now doing what we can to remedy this, first by including a disclaimer at the top of the older leafleting pages, and second by planning a full update of the leafleting intervention page in the near future.

Per your concern about cost-effectiveness estimates, I do want to say that our research team will be making such calculations public on our Guesstimate page as time permits. But for the time being, we had to take down our internal impact calculator because the way that we used it internally did not match the ways others (like Slate Star Codex) were using it. We were trying to err on the side of openness by keeping it public for as long as we did, but in retrospect there just wasn't a good way for others to use the tool in the way we used it internally. Thankfully, the Guesstimate platform includes upper and lower bounds directly in the presented data, so we feel it will be much more appropriate for us to share with the public.
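
To sketch the kind of output we have in mind - this is a minimal illustration with entirely made-up numbers, not ACE's actual figures or the actual calculator - a Monte Carlo estimate in the style of Guesstimate reports bounds alongside the central value:

```python
import random

def cost_per_outcome(n=100_000):
    """Monte Carlo sketch: sample a cost-per-conversion ratio and report
    the median with 5th/95th percentile bounds, Guesstimate-style.
    All input ranges below are hypothetical."""
    samples = []
    for _ in range(n):
        cost = random.uniform(0.10, 0.35)   # $ per leaflet (hypothetical)
        rate = random.uniform(0.001, 0.02)  # conversions per leaflet (hypothetical)
        samples.append(cost / rate)
    samples.sort()
    return (samples[n // 2],          # median
            samples[int(0.05 * n)],   # 5th percentile
            samples[int(0.95 * n)])   # 95th percentile

median, low, high = cost_per_outcome()
print(f"Cost per conversion: ~${median:.0f} (90% interval: ${low:.0f} to ${high:.0f})")
```

Presenting the interval rather than a bare point estimate makes it immediately visible when the plausible range spans an order of magnitude or more, which is the context our old point estimates failed to convey.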

You said "I think the error was in the estimate rather than in expectation management" because you felt the estimate itself wasn't good; but I hope this makes it more clear that we feel that the way we were internally using upper and lower bounds was good; it's just that the way we were talking about these calculations was not.

Internally, when we look at and compare animal charities, we continue to use cost-effectiveness estimates as detailed on our evaluation criteria page. We intend to publicly display these kinds of calculations on Guesstimate in the future.

As you've said, the lesson should not be for people to trust things others say less in general. I completely agree with this sentiment. Instead, when it comes to us, the lessons we're taking are: (1) communications staff need to better explain our current stance on existing pages; (2) communications staff should better understand that readers may draw conclusions solely from older pages, without reading our more current thinking on recently published pages; and (3) research staff should be more discriminating about which types of internal tools are appropriate for public use. There may be further lessons to learn as ACE staff continue to discuss these issues internally, but this is our current thinking.

Comment author: JBeshir 13 January 2017 09:30:39PM 4 points

This all makes sense, and I think it is a very reasonable perspective. I hope this ongoing process goes well.

Comment author: Gleb_T (EA Profile) 13 January 2017 12:23:26PM -4 points

Sarah's post highlights some of the essential tensions at the heart of Effective Altruism.

Do we care about "doing the most good that we can" or "being as transparent and honest as we can"? These are two different value sets. They will sometimes overlap, and in other cases will not.

And please don't say that "we do the most good that we can by being as transparent and honest as we can" or that "being as transparent and honest as we can" is best in the long term. Just don't. You're simply lying to yourself and to everyone else if you say that. If you can't imagine a scenario where "doing the most good that we can" and "being as transparent and honest as we can" are opposed, you've just suffered from a failure mode by flinching away from the truth.

So when push comes to shove, which one do we prioritize? When we have to throw the switch and have the trolley crush either "doing the most good" or "being as transparent and honest as we can," which do we choose?

For a toy example, say you are talking to your billionaire uncle on his deathbed and trying to convince him to leave money to AMF instead of his current favorite charity, the local art museum. You know he would respond better if you exaggerate the impact of AMF. Would you do so, whether lying by omission or in any other way, in order to get much more money for AMF, given that no one else would find out about this situation? What about if you know that other family members are standing in the wings and ready to use all sorts of lies to advocate for their favorite charities?

If you do not lie, that's fine, but don't pretend that you care about doing the most good, please. Just don't. You care about being as transparent and honest as possible over doing the most good.

If you do lie to your uncle, then you do care about doing the most good. However, you should consider at what price point you will not lie - at this point, we're just haggling.

The people quoted in Sarah's post (including myself) all highlight how doing the most good sometimes involves not being as transparent and honest as we can. Different people have different price points, that's all. We're all willing to bite the bullet and sometimes send that trolley over transparency and honesty - whether by questioning the value of public criticism, as Ben does, appealing to emotions, as Rob does, or using intuition as evidence, as Jacy does - for the sake of what we believe is the most good.

As a movement, EA has a big problem with believing that ends never justify the means. Yes, sometimes ends do justify the means - at least if we care about doing the most good. We can debate whether we are mistaken about the ends justifying the means in any given case, but using insufficient means to accomplish the ends is just as bad as using excessive means. If we are truly serious about doing as much good as possible, we should let our end goal be the North Star and work backward from there, rather than hobbling ourselves with preconceived notions of "intellectual rigor" at the cost of doing the most good.

Comment author: JBeshir 13 January 2017 02:55:58PM 2 points

I at least would say that I care about doing the most good that I can, but I am also mindful of the fact that I run on corrupted hardware, which makes ends-justify-the-means arguments unreliable, per EY's classic argument (http://lesswrong.com/lw/uv/ends_dont_justify_means_among_humans/):

""The end does not justify the means" is just consequentialist reasoning at one meta-level up. If a human starts thinking on the object level that the end justifies the means, this has awful consequences given our untrustworthy brains; therefore a human shouldn't think this way. But it is all still ultimately consequentialism. It's just reflective consequentialism, for beings who know that their moment-by-moment decisions are made by untrusted hardware."

This doesn't mean I think there's never a circumstance where you need to breach a deontological rule; I agree with EY when they say "I think the universe is sufficiently unkind that we can justly be forced to consider situations of this sort." This is why, under Sarah's definition of absolutely binding promises, I would simply never make such a promise - I might say that I would try my best, and that to the best of my knowledge there was nothing that would prevent me from doing the thing, or something like that - but I think the universe can be amazingly inconvenient, and I don't want to be a pretender at principles I would not actually, in extremis, live up to.

The theory I tend to operate under I think of as "biased naive consequentialism": I do naive consequentialism - estimating consequences out as far as I can easily see - and then introduce a heavy bias against things which are likely to have untracked bad consequences, e.g. lying or theft. (I'm kind of amused that all the adjectives in the description are negative ones.) But under a sufficiently massive difference, sure, I'd lie to an axe murderer. This means there is a "price", somewhere. This is probably most similar to the concept of "way utilitarianism", which I think is way better than either act or rule utilitarianism, and is discussed as a sort of steelman of Mohist ideas (https://plato.stanford.edu/entries/mohism/).

One of the things I take from the thinking around the non-central fallacy aka the worst argument in the world (http://lesswrong.com/lw/e95/the_noncentral_fallacy_the_worst_argument_in_the/) is that one should smoothly reduce the strength of such biases for examples which are very atypical of the circumstances the bias was intended for, so as to not have weird sharp edges near category borders.
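
As a toy sketch of that procedure (the penalty values and the typicality weighting are invented purely for illustration, not a claim about the right numbers):

```python
def decision_value(naive_expected_utility, rule_violations):
    """Toy 'biased naive consequentialism': start from the naive
    expected-utility estimate, then subtract a penalty for each violated
    rule, scaled by how typical this case is of the situations the rule
    was designed for. Penalties stand in for hard-to-track harms
    (damaged trust, corrupted judgement); all numbers are illustrative."""
    PENALTIES = {"lying": 100.0, "theft": 150.0}
    value = naive_expected_utility
    for rule, typicality in rule_violations:  # typicality in [0, 1]
        value -= PENALTIES[rule] * typicality
    return value

# Everyday fundraising pitch: a small naive gain never beats the penalty.
print(decision_value(5.0, [("lying", 1.0)]))     # -95.0 -> don't lie
# Axe murderer at the door: huge stakes, very atypical case for the rule.
print(decision_value(1000.0, [("lying", 0.1)]))  # 990.0 -> lie
```

The typicality factor is exactly the smooth reduction described above: the bias stays near full strength for ordinary, central cases of lying, and only washes out in the weird edge cases the rule was never designed for.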

All this is to say that in weird extreme edge cases, under conditions of perfect knowledge, I think what people do is not important. It's okay to have a price. But in the central cases, in actual life, I think they should have either a very strong bias against deception and for viewing deceptive behaviour poorly, or an outright deontological prohibition if they can't reliably maintain that.

If I were to name one big problem, it's that in practice some people's price seems to be capable only of being infinite or zero - or even negative: a lot of people seem to get tempted by cool "ends justify means" arguments which don't even look prima facie like they'd actually have positive utility. Trading discourse and growth for money in a nascent movement? Even naive utilitarianism can track far enough out to see the problems there; you'd have to have an intuitive preference for deception to favour it.

I disagree with you in that I think infinite works fine almost always, so it wouldn't be a big problem if everyone had that - I'd be very happy if all the people whose price to cheat is around zero moved it to infinite. But I agree with you that infinite isn't actually the correct answer for an ideal unbiased reasoner - just not that this should affect how humans behave under the normal circumstances in which the EA movement works.

The alarming part for me is that I think, in general, these debates do affect how people behave, because people erroneously jump from "a hypothetical perfect reasoner in a hypothetical scenario would not behave deontologically" to sketchiness in practice.

Comment author: Ben_West (EA Profile) 13 January 2017 12:38:47AM 11 points

ACE's primary output is its charity recommendations, and I would guess that its "top charities" page is viewed ~100x more than the leafleting page Sarah links to.

ACE does not give the "top charity" designation to any organization which focuses primarily on leafleting, and e.g. the page for Vegan Outreach explicitly states that VO is not considered a top charity because of its focus on leafleting and the lack of robust research on that:

We have some concerns that Vegan Outreach has relied too heavily on poor sources of evidence to determine the effectiveness of leafleting as compared to other interventions... Why didn’t Vegan Outreach receive our top recommendation? Although we are impressed with Vegan Outreach’s recent openness to change and their attempts to measure their effectiveness, we still have reservations about their heavy focus on leafleting programs

You are proposing that ACE said negative things about leafleting on its most prominent pages, but left some text saying good things about leafleting buried on a back page, as part of a dastardly plot to increase donations to organizations they don't even recommend.

This seems unlikely to me, to put it mildly, but more importantly: it's incredibly important that we assume others are acting in good faith. I disagree with you about this, but I don't think that you are trying to "throw out actually having discourse on effectiveness". This, more than any empirical fact about the likelihood of your hypothesis, is why I think your comment is unhelpful.

Comment author: JBeshir 13 January 2017 10:08:12AM 5 points

This definitely isn't the kind of deliberate where there's an overarching plot, but it's not distinguishable from the kind of deliberate where a person sees a thing they should do, or a reason not to write what they're writing, and knowingly ignores it - though I'd agree that it's more likely they flinched away unconsciously.

It's worth noting that while Vegan Outreach is not listed as a top charity it is listed as a standout charity, with their page here: https://animalcharityevaluators.org/research/charity-review/vegan-outreach/

I don't think it is good to laud positive evidence but refer to negative evidence only by saying "there is a lack of evidence", which is what the disclaimers do - in particular, there's no mention of the evidence against there being any effect at all. Nor is it good to refer to studies which are clearly entirely invalid as merely "poor" while still relying on their data. It shouldn't be "there is good evidence" when there's evidence for and "the evidence is still under debate" when there's evidence against, and there shouldn't be a "gushing praise upfront, provisos later" approach unless you feel the praise is still justified after the provisos. And "have reservations" is pretty weak. These are not good acts from a supposedly neutral evaluator.

As an example, until the revision in November 2016 the VO page opened with: "Vegan Outreach (VO) engages almost exclusively in a single intervention, leafleting on behalf of farmed animals, which we consider to be among the most effective ways to help animals." Even now I don't think it represents the state of affairs well.

If, in trying to resolve the matter of whether it has high expected impact or not, you went to the main review on leafleting (https://animalcharityevaluators.org/research/interventions/leafleting/), you'd find it began with "The existing evidence on the impact of leafleting is among the strongest bodies of evidence bearing on animal advocacy methods."

This is a very central Not Technically a Lie (http://lesswrong.com/lw/11y/not_technically_lying/); the example of a not-technically-a-lie in that post is using the phrase "The strongest painkiller I have." to refer to something with no painkilling properties when you have no painkillers at all. I feel this isn't something that should be taken lightly:

"NTL, by contrast, may be too cheap. If I lie about something, I realize that I'm lying and I feel bad that I have to. I may change my behaviour in the future to avoid that. I may realize that it reflects poorly on me as a person. But if I don't technically lie, well, hey! I'm still an honest, upright person and I can thus justify visciously misleading people because at least I'm not technically dishonest."

The disclaimer added now helps things, but good judgement should have resulted in an update and correction being transparently issued well before now.

The part which strikes me as most egregious is the deprioritising of an update to a review of what was described in a bunch of places as the most cost-effective (and therefore most effective) intervention. I can't see any reason for that, other than that the update would have been negative.

There may not have been conscious intent behind this - I could assume it was the result of poor judgement rather than design - but it did mislead the discourse on effectiveness. That already happened, and not because people did the best they could with the information available to them, but because of poor decisions given that information. Whether it got more donations or not is unclear - it might have tempted more people into offsetting, but on the other hand each person who offset would have paid less, since the understated figures meant they weren't actually offsetting their consumption.

However something like this is handled is also how a bad actor would be handled, because a bad actor would be indistinguishable from this; if we let this pass without criticism and reform, then bad actors would also pass without criticism and reform.

I think when it comes to responding to some pretty severe stuff of this sort, even if you assume the people involved acted in good faith and just had some rationality failings, more needs to be said than "mistakes were made, we'll assume you're doing the best you can to not make them again". I don't have a grand theory of how people should react here, but it needs to be more than that.

My inclination is to at the least frankly express how severe I think it is- even if it's not the nicest thing I could say.

Comment author: EricHerboso (EA Profile) 13 January 2017 01:14:52AM 5 points

Well said, Erika. I'm happy with most of these changes, though I'm sad that we have had to remove the impact calculator in order to ensure others don't get the wrong idea about how seriously such estimates should be taken. Thankfully, Allison plans on implementing a replacement for it at some point using the Guesstimate platform.

For those interested in seeing the exact changes ACE has made to the site, see the disclaimer at the top of the leafleting intervention page and the updates to our mistakes page.

Comment author: JBeshir 13 January 2017 09:55:36AM 2 points

Thank you for the response. I'm glad that things are being improved, and that there seems to be an honest interest in doing better.

I feel "ensure others don't get the wrong idea about how seriously such estimates should be taken" is understating things- it should be reasonable for people to ascribe some non-zero level of meaning to issued estimates, and especially it should be that using them to compare between charities doesn't lead you massively astray. If it's "the wrong idea" to look at an estimate at all, because it isn't the true best reasoned expectation of results the evaluator has, I think the error was in the estimate rather than in expectation management, and find the deflection of responsibility here to the people who took ACE at all seriously concerning.

The solution here shouldn't be for people to trust things others say less in general.

Compare, say, GiveWell's analysis of LLINs (http://www.givewell.org/international/technical/programs/insecticide-treated-nets#HowcosteffectiveisLLINdistribution); it's very rough and the numbers shouldn't be assumed to be close to right (and responsibly, they describe all this), but their methodology makes them viable for comparison purposes.

Cost-effectiveness is important - it is the measure of where putting your money does the most good and how much good you can expect to do, and a cost-effectiveness estimate that fully incorporates risks and data issues is basically what one arrives at when one determines what is effective. Even if you use other selection strategies for top charities, incorrect cost-effectiveness estimates are not good.

Comment author: Fluttershy 12 January 2017 04:24:29AM 9 points

I should add that I'm grateful for the many EAs who don't engage in dishonest behavior, and that I'm equally grateful for the EAs who used to be more dishonest, and later decided that honesty was more important (either instrumentally, or for its own sake) to their system of ethics than they'd previously thought. My insecurity seems to have sadly dulled my warmth in my above comment, and I want to be better than that.

Comment author: JBeshir 12 January 2017 02:06:22PM 2 points

I find it difficult to combine "I want to be nice and sympathetic and forgiving of people trying to be good people and assume everyone is" with "I think people are not taking this seriously enough and want to tell you how seriously it should be taken". It's easier to be forgiving when you can trust people to take it seriously.

I've kind of erred on the side of the latter today, because "no one criticises dishonesty or rationalisation because they want to be nice" seems like a concerning failure mode, but it'd be nice if I were better at combining both.

Comment author: JBeshir 12 January 2017 02:01:50PM 10 points

One very object-level thing which could be done to make longform, persistent, not hit-and-run discussion in this particular venue easier: email notifications for comments on articles you've commented on.

There doesn't seem to be a preference setting for that, and it doesn't seem to be default, so it's only because I remember to come check here repeatedly that I can reply to things. Nothing is going to be as good at reaching me as Facebook/other app notifications on my phone, but email would do something.

Comment author: Ben_West (EA Profile) 12 January 2017 01:34:42AM 7 points

apparently the main evaluator for the animal rights wing of the EA movement has already decided to join it and throw out actually having discourse on effectiveness in favour of plundering their reputation for more donations

This seems like an exaggerated and unhelpful thing to say.

Comment author: JBeshir 12 January 2017 01:46:17PM 1 point

Perhaps. It's certainly what the people arguing that deliberate dishonesty would be okay are suggesting, it is what a large amount of online advocacy does, and it is in effect what they did - but they probably didn't consciously decide to do it. I'm not sure how much credit not having consciously decided is worth, though, because that seems to reward people for not thinking very hard about what they're doing, and they did it from a position of authority and (thus) responsibility.

I stand by the use of the word 'plundering' - it's surprising how some people are willing to hum and haw about it maybe being worth it, when doing it deliberately would be a very short-sighted, destroy-the-future-for-money-now act. That calls for a strong term. And I stand by the position that it would throw out actually having discourse on effectiveness if people played those sorts of games, withheld information that would be bad for causes they think are good, etc., rather than being scrupulously honest. But again, to say they 'decided' to do those things is perhaps not entirely right.

I think in an evaluator, which is in a sense a watchdog for other peoples' claims, these kinds of things really are pretty serious - it would be scandalous if, e.g., GiveWell were found to have been overexcited about something and ignored issues with it on this level. Their job is to curb enthusiasm, not just be another advocate. So taking it seriously is pretty called for. As I mentioned in a comment below, though, maybe part of the problem is that EA people tried to treat ACE as a more robust evaluator than it was actually intending to be, and the consequence should be that they shift to regarding it as a source of pointers whose own statements are to be taken with a large grain of salt, the way individual charities' statements are.

Comment author: Peter_Hurford (EA Profile) 11 January 2017 08:03:33PM 9 points

I'm involved with ACE as a board member and independent volunteer researcher, but I speak for myself. I agree with you that the leafleting complaints are legitimate -- I've been advocating more skepticism toward the leafleting numbers for years. But I feel like it's pretty harsh to think ACE needs to be entirely replaced.

I don't know if it's helpful, but I can promise you that there's no intentional PR campaign on behalf of ACE to exaggerate in order to grow the movement. All I see is an overworked org with insufficient resources to double-check all the content on their site.

Judging the character of the ACE staff through my interactions with them, I don't think there was any intent to mislead on leaflets. I'd put it more as negligence arising from over-excitement about the initial studies (despite their many methodological flaws), insufficient skepticism, and not fully thinking through how things would be interpreted (the claim that the leafleting evidence is the strongest among AR interventions is technically true). The one particular sentence, among the thousands on the site, went pretty much unnoticed until Harrison brought it up.

Comment author: JBeshir 11 January 2017 10:21:28PM 7 points

Thanks for the feedback, and I'm sorry that it's harsh. I'm willing to believe that it wasn't conscious intent at publication time at least.

But it seems quite likely to me from the outside that if they had thought the numbers were underestimates, they'd have fixed them a lot faster, and unless that's not true, it's a pretty severe ethics problem. I'm sure it was a matter of "it's an error that's not hurting anyone, because charity is good, so it isn't very important", or even just a generic motivation problem in volunteering to fix it - some kind of rationalisation that felt good, rather than "I'm going to lie for the greater good"; the only people advocating that outright seem to be other commenters. But it's still a pretty bad ethics issue for an evaluator to succumb to the temptation to defer an unfavourable update.

I think some of this might be that the EA community was overly aggressive in finding them and sort of treating them as the animal charity GiveWell, because EA wanted there to be one, when ACE weren't really aiming to be that robust. A good, robust evaluator's job is to screen out bad studies and to examine other peoples' enthusiasm and work out how grounded it is, with transparent handling of errors (GiveWell does updates that discuss them and such) and updating in response to new information. From that perspective, taking a severely poor study at face value and not correcting it for years, resulting in a large number of people arriving at wrong valuations, was a pretty huge failing. Making "technically correct" but very misleading statements - which we'd view poorly if they came from a company advertising itself - is also very bad in an organisation whose job is basically to help you sort through everyone else's advertisements.

Maybe the sensible thing for now is to assume that there is no animal charity evaluator that's good enough to safely defer to, and that all there are are people who may point you to papers which, caveat emptor, you have to check yourself.

Comment author: JBeshir 11 January 2017 07:47:21PM 3 points

Copying my post from the Facebook thread:

Some of the stuff in the original post I disagree on, but the ACE stuff was pretty awful. Animal advocacy in general has had severe problems with falling prey to the temptation to exaggerate or outright lie for a quick win today, especially about health, and it's disturbing that apparently the main evaluator for the animal rights wing of the EA movement has already decided to join in and throw out actually having discourse on effectiveness in favour of plundering its reputation for more donations today. A mistake is a typo, or leaving something up accidentally, or publishing something early by accident - and even then it is only a mitigation if corrective action was taken once detected. This was at minimum negligence, but given that it stayed up for years without the trivial effort needed to fix it, it should probably be regarded as just a lie. ACE needs replacing with a better and actually honest evaluator.

One of the ways this negatively impacted the effectiveness discourse: during late 2015 there was an article arguing for ethical offsetting of meat eating (http://slatestarcodex.com/.../vegetarianism-for-meat-eaters/), but it used ACE's figures, and so understated the amounts people needed to donate by possibly multiple orders of magnitude.
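
To make the order-of-magnitude point concrete, here's a back-of-the-envelope sketch; the per-leaflet cost and the conversion rates are hypothetical stand-ins, not ACE's published figures or the article's actual numbers:

```python
cost_per_leaflet = 0.20  # dollars, hypothetical
# Cost to "offset" one person-year of meat eating = cost to induce one
# vegetarian-year via leaflets = cost per leaflet / conversion rate.
for conversion_rate in (0.02, 0.002, 0.0002):  # conversions per leaflet
    offset = cost_per_leaflet / conversion_rate
    print(f"assumed conversion rate {conversion_rate:.2%}: offset ~${offset:,.0f}/year")
```

If the assumed conversion rate is a hundred times too optimistic, the suggested offset donation comes out a hundred times too low, which is exactly the failure mode here.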

More concerning is the extent to which the (EDIT: Facebook) comments on this post and the previously cited ones go ahead and justify even deliberate lying - "Yes, but hypothetically lying might be okay under some circumstances, like to save the world, and I can't absolutely prove it's not justified here, so I'm not going to judge anyone badly for lying" - as with Bryd's original post. The article sets out a pretty weak case for "EA needs stronger norms against lying" aside from the animal rights wing, but the comments basically confirm it.

I know that answering "How can we build a movement that matches religious movements in output (http://lesswrong.com/.../can_humanism_match_religions.../), how can we grow and build effectiveness, how can we coordinate like the best, how can we overcome that people think that charity is a scam?" with "Have we considered /becoming pathological liars/? I've not proven it can't work, so let's assume it does and debate from there" is fun and edgy, but it's also terrible.

I can think of circumstances where I'd void my GWWC pledge: if they ever pulled any of this "lying to get more donations" stuff, I'd stick with TLYCS and a personal commitment, but leave their website.