Comment author: EricHerboso 13 January 2017 07:30:50PM * 7 points

I agree: it is indeed reasonable for people to have read our estimates the way they did. But when I said that we don't want others to "get the wrong idea", I wasn't claiming that the readers were at fault. I was claiming that the ACE communications staff was at fault.

Internally, the ACE research team was fairly clear about what we thought about leafleting in 2014. But the communications staff (and, in particular, I) failed to adequately get across these concerns at the time.

Later, in 2015 and 2016, I feel that whenever an issue like leafleting came up publicly, ACE was good about clearly expressing our reservations. But we neglected to update the older 2014 page with the same kind of language that we now use when talking about these things. We are now doing what we can to remedy this, first by including a disclaimer at the top of the older leafleting pages, and second by planning a full update of the leafleting intervention page in the near future.

Per your concern about cost-effectiveness estimates, I do want to say that our research team will be making such calculations public on our Guesstimate page as time permits. But for the time being, we had to take down our internal impact calculator because the way that we used it internally did not match the ways others (like Slate Star Codex) were using it. We were trying to err on the side of openness by keeping it public for as long as we did, but in retrospect there just wasn't a good way for others to use the tool in the way we used it internally. Thankfully, the Guesstimate platform includes upper and lower bounds directly in the presented data, so we feel it will be much more appropriate for us to share with the public.

You said "I think the error was in the estimate rather than in expectation management" because you felt the estimate itself wasn't good; but I hope this makes it more clear that we feel that the way we were internally using upper and lower bounds was good; it's just that the way we were talking about these calculations was not.

Internally, when we look at and compare animal charities, we continue to use cost effectiveness estimates as detailed on our evaluation criteria page. We intend to publicly display these kinds of calculations on Guesstimate in the future.

As you've said, the lesson should not be for people to trust things others say less in general. I completely agree with this sentiment. Instead, when it comes to us, the lessons we're taking are: (1) communications staff needs to better explain our current stance on existing pages, (2) comm staff should better understand that readers may draw conclusions solely from older pages, without reading our more current thinking on more recently published pages, and (3) research staff should be more discriminating on what types of internal tools are appropriate for public use. There may also be further lessons that can be learned from this as ACE staff continues to discuss these issues internally. But, for now, this is what we're currently thinking.

Comment author: JBeshir 13 January 2017 09:30:39PM 4 points

This all makes sense, and I think it is a very reasonable perspective. I hope this ongoing process goes well.

Comment author: Ben_West 13 January 2017 12:38:47AM 11 points

ACE's primary output is its charity recommendations, and I would guess that its "top charities" page is viewed ~100x more than the leafleting page Sarah links to.

ACE does not give the "top charity" designation to any organization which focuses primarily on leafleting, and e.g. the page for Vegan Outreach explicitly states that VO is not considered a top charity because of its focus on leafleting and the lack of robust research on that:

"We have some concerns that Vegan Outreach has relied too heavily on poor sources of evidence to determine the effectiveness of leafleting as compared to other interventions... Why didn’t Vegan Outreach receive our top recommendation? Although we are impressed with Vegan Outreach’s recent openness to change and their attempts to measure their effectiveness, we still have reservations about their heavy focus on leafleting programs"

You are proposing that ACE says negative things about leafleting on its most prominent pages, but has left some text buried in a back page that says good things about leafleting, as part of a dastardly plot to increase donations to organizations they don't even recommend.

This seems unlikely to me, to put it mildly, but more importantly: it's incredibly important that we assume others are acting in good faith. I disagree with you about this, but I don't think that you are trying to "throw out actually having discourse on effectiveness". This, more than any empirical fact about the likelihood of your hypothesis, is why I think your comment is unhelpful.

Comment author: JBeshir 13 January 2017 10:08:12AM * 5 points

This definitely isn't the kind of deliberate act where there's an overarching plot, but it's not distinguishable from the kind where a person sees a thing they should do, or a reason not to write what they're writing, and knowingly ignores it. That said, I agree it's more likely they flinched away unconsciously.

It's worth noting that while Vegan Outreach is not listed as a top charity, it is listed as a standout charity, with their page here: https://animalcharityevaluators.org/research/charity-review/vegan-outreach/

I don't think it is good to laud positive evidence but refer to negative evidence only by saying "there is a lack of evidence", which is what the disclaimers do; in particular, there's no mention of the evidence against there being any effect at all. Nor is it good to refer to studies which are clearly entirely invalid as merely "poor" while still relying on their data. It shouldn't be "there is good evidence" when there's evidence for, and "the evidence is still under debate" when there's evidence against, and there shouldn't be a "gushing praise upfront, provisos later" approach unless you feel the praise is still justified after the provisos. And "have reservations" is pretty weak. These are not good acts from a supposedly neutral evaluator.

Until the revision in November 2016, the VO page opened with: "Vegan Outreach (VO) engages almost exclusively in a single intervention, leafleting on behalf of farmed animals, which we consider to be among the most effective ways to help animals.", as an example of this. Even now I don't think it represents the state of affairs well.

If in trying to resolve the matter of whether it has high expected impact or not, you went to the main review on leafleting (https://animalcharityevaluators.org/research/interventions/leafleting/), you'd find it began with "The existing evidence on the impact of leafleting is among the strongest bodies of evidence bearing on animal advocacy methods.".

This is a very central example of Not Technically a Lie (http://lesswrong.com/lw/11y/not_technically_lying/); the example in that post is using the phrase "The strongest painkiller I have." to refer to something with no painkilling properties when you have no painkillers. I feel this isn't something that should be taken lightly:

"NTL, by contrast, may be too cheap. If I lie about something, I realize that I'm lying and I feel bad that I have to. I may change my behaviour in the future to avoid that. I may realize that it reflects poorly on me as a person. But if I don't technically lie, well, hey! I'm still an honest, upright person and I can thus justify viciously misleading people because at least I'm not technically dishonest."

The disclaimer added now helps things, but good judgement should have resulted in an update and correction being transparently issued well before now.

The part which strikes me as most egregious was in the deprioritising of updating a review on what was described in a bunch of places as the most cost effective (and therefore most effective) intervention. I can't see any reason for that, other than that the update would have been negative.

There may not have been conscious intent behind this (I could assume that this was a result of poor judgement rather than design), but it did mislead the discourse on effectiveness. That already happened, and not as a result of people doing the best thing given the information available to them, but as a result of poor decisions given this information. Whether it got more donations or not is unclear: it might have tempted more people into offsetting, but on the other hand each person who did offsetting would have paid less, because they wouldn't have actually offset themselves.

Whatever way something like this is handled is also how a bad actor would be handled, because a bad actor would be indistinguishable from this; if we let this go by without criticism and reform, then bad actors would also be let by without criticism and reform.

I think when it comes to responding to some pretty severe stuff of this sort, even if you assume the people acted in good faith and just had some rationality failings, more needs to be said than "mistakes were made, we'll assume you're doing the best you can to not make them again". I don't have a grand theory of how people should react here, but it needs to be more than that.

My inclination is to at the least frankly express how severe I think it is- even if it's not the nicest thing I could say.

Comment author: EricHerboso 13 January 2017 01:14:52AM 5 points

Well said, Erika. I'm happy with most of these changes, though I'm sad that we have had to remove the impact calculator in order to ensure others don't get the wrong idea about how seriously such estimates should be taken. Thankfully, Allison plans on implementing a replacement for it at some point using the Guesstimate platform.

For those interested in seeing the exact changes ACE has made to the site, see the disclaimer at the top of the leafleting intervention page and the updates to our mistakes page.

Comment author: JBeshir 13 January 2017 09:55:36AM * 2 points

Thank you for the response, and I'm glad that it's being improved, and that there seems to be an honest interest in doing better.

I feel "ensure others don't get the wrong idea about how seriously such estimates should be taken" is understating things: it should be reasonable for people to ascribe some non-zero level of meaning to issued estimates, and especially it should be the case that using them to compare between charities doesn't lead you massively astray. If it's "the wrong idea" to look at an estimate at all, because it isn't the true best reasoned expectation of results the evaluator has, I think the error was in the estimate rather than in expectation management, and I find the deflection of responsibility here, onto the people who took ACE at all seriously, concerning.

The solution here shouldn't be for people to trust things others say less in general.

Compare, say, GiveWell's analysis of LLINs (http://www.givewell.org/international/technical/programs/insecticide-treated-nets#HowcosteffectiveisLLINdistribution); it's very rough and the numbers shouldn't be assumed to be close to right (and responsibly, they describe all this), but their methodology makes them viable for comparison purposes.

Cost-effectiveness is important: it is the measure of where putting your money does the most good and how much good you can expect to do, and a cost-effectiveness estimate that fully includes risks and data issues is basically what one is arriving at when one determines what is effective. Even if you use other selection strategies for top charities, incorrect cost-effectiveness estimates are not good.

Comment author: Fluttershy 12 January 2017 04:24:29AM 9 points

I should add that I'm grateful for the many EAs who don't engage in dishonest behavior, and that I'm equally grateful for the EAs who used to be more dishonest, and later decided that honesty was more important (either instrumentally, or for its own sake) to their system of ethics than they'd previously thought. My insecurity seems to have sadly dulled my warmth in my above comment, and I want to be better than that.

Comment author: JBeshir 12 January 2017 02:06:22PM 2 points

I find it difficult to combine "I want to be nice and sympathetic and forgiving of people trying to be good people and assume everyone is" with "I think people are not taking this seriously enough and want to tell you how seriously it should be taken". It's easier to be forgiving when you can trust people to take it seriously.

I've kind of erred on the side of the latter today, because "no one criticises dishonesty or rationalisation because they want to be nice" seems like a concerning failure mode, but it'd be nice if I were better at combining both.

Comment author: JBeshir 12 January 2017 02:01:50PM 10 points

One very object-level thing which could be done to make longform, persistent, not hit-and-run discussion in this particular venue easier: email notifications of comments on articles you've commented on.

There doesn't seem to be a preference setting for that, and it doesn't seem to be the default, so it's only because I remember to come check here repeatedly that I can reply to things. Nothing is going to be as good at reaching me as Facebook/other app notifications on my phone, but email would do something.

Comment author: Ben_West 12 January 2017 01:34:42AM 7 points

"apparently the main evaluator for the animal rights wing of the EA movement has already decided to join it and throw out actually having discourse on effectiveness in favour of plundering their reputation for more donations"

This seems like an exaggerated and unhelpful thing to say.

Comment author: JBeshir 12 January 2017 01:46:17PM 1 point

Perhaps. It's certainly what the people suggesting that deliberate dishonesty would be okay are suggesting, and it is what a large amount of online advocacy does, and it is in effect what they did, but they probably didn't consciously decide to do it. I'm not sure how much credit not having consciously decided is worth, though, because that seems to just reward people for not thinking very hard about what they're doing, and they did it from a position of authority and (thus) responsibility.

I stand by the use of the word 'plundering'. It's surprising how some people are willing to hum and haw about it maybe being worth it, when doing it deliberately would be a very short-sighted, destroy-the-future-for-money-now act. It calls for such a strong term. And I stand by the position that it would throw out actually having discourse on effectiveness if people played those sorts of games, withheld information that would be bad for causes they think are good, etc, rather than being scrupulously honest. But again, to say they 'decided' to do those things is perhaps not entirely right.

I think in an evaluator, which is in a sense a watchdog for other people's claims, these kinds of things really are pretty serious; it would be scandalous if e.g. GiveWell were found to have been overexcited about something and ignored issues with it on this level. Their job is to curb enthusiasm, not just be another advocate. So I think taking it seriously is pretty called for. As I mentioned in a comment below, though, maybe part of the problem is that EA people tried to take ACE as a more robust evaluator than it was actually intending to be, and the consequence should be that they shift to regarding it as a source for pointers whose own statements are to be taken with a large grain of salt, the way individual charity statements are.

Comment author: Peter_Hurford 11 January 2017 08:03:33PM 9 points

I'm involved with ACE as a board member and independent volunteer researcher, but I speak for myself. I agree with you that the leafleting complaints are legitimate -- I've been advocating more skepticism toward the leafleting numbers for years. But I feel like it's pretty harsh to think ACE needs to be entirely replaced.

I don't know if it's helpful, but I can promise you that there's no intentional PR campaign on behalf of ACE to exaggerate in order to grow the movement. All I see is an overworked org with insufficient resources to double check all the content on their site.

Judging the character of the ACE staff through my interactions with them, I don't think there was any intent to mislead on leaflets. I'd put it more as negligence arising from over-excitement from the initial studies (despite lots of methodological flaws), insufficient skepticism, and not fully thinking through how things would be interpreted (the claim that leafleting evidence is the strongest among AR is technically true). The one particular sentence, among the thousands on the site, went pretty much unnoticed until Harrison brought it up.

Comment author: JBeshir 11 January 2017 10:21:28PM 7 points

Thanks for the feedback, and I'm sorry that it's harsh. I'm willing to believe that it wasn't conscious intent at publication time at least.

But it seems quite likely to me from the outside that if they thought the numbers were underestimates they'd have fixed them a lot faster, and unless that's not true it's a pretty severe ethics problem. I'm sure it was a matter of "it's an error that's not hurting anyone because charity is good, so it isn't very important", or even just a generic motivation problem in volunteering to fix it, some kind of rationalisation that felt good rather than "I'm going to lie for the greater good" (the only people advocating that outright seem to be other commenters), but it's still a pretty bad ethics issue for an evaluator to succumb to the temptation to defer an unfavourable update.

I think some of this might be that the EA community was overly aggressive in finding them and sort of treating them as the animal charity GiveWell, because EA wanted there to be one, when ACE weren't really aiming to be that robust. A good, robust evaluator's job should be to screen out bad studies and to examine other people's enthusiasm and work out how grounded it was, with transparent handling of errors (GiveWell does updates that discuss them and such) and updating in response to new information, and from that perspective taking a severely poor study at face value and not correcting it for years, resulting in a large number of people getting wrong valuations, was a pretty huge failing. Making "technically correct" but very misleading statements which we'd view poorly if they came from a company advertising itself is also very bad in an organisation whose job is basically to help you sort through everyone else's advertisements.

Maybe the sensible thing for now is to assume that there is no animal charity evaluator that's good enough to safely defer to, and that all there are are people who may point you to papers which, caveat emptor, you have to check yourself.

Comment author: JBeshir 11 January 2017 07:47:21PM 3 points

Copying my post from the Facebook thread:

Some of the stuff in the original post I disagree on, but the ACE stuff was pretty awful. Animal advocacy in general has had severe problems with falling prey to the temptation to exaggerate or outright lie for a quick win today, especially about health, and it's disturbing that apparently the main evaluator for the animal rights wing of the EA movement has already decided to join it and throw out actually having discourse on effectiveness in favour of plundering their reputation for more donations today. A mistake is a typo, or leaving something up accidentally, or publishing something early by accident, and it is only a mitigation if corrective action was taken once detected. This was at the minimum negligence, but given that it's been there for years without anyone making the trivial effort to fix it, it should probably be regarded as just a lie. ACE needs replacing with a better and actually honest evaluator.

One of the ways this negatively impacted the effectiveness discourse: During late 2015 there was an article written arguing for ethical offsetting of meat eating (http://slatestarcodex.com/.../vegetarianism-for-meat-eaters/), but it used ACE's figures, and so understated the amounts people needed to donate by possibly multiple orders of magnitude.

More concerning is the extent to which the (EDIT: Facebook) comments on this post and the previously cited ones go ahead and justify even deliberate lying: "Yes, but hypothetically lying might be okay under some circumstances, like to save the world, and I can't absolutely prove it's not justified here, so I'm not going to judge anyone badly for lying", as with Bryd's original post as well. The article sets out a pretty weak case for "EA needs stronger norms against lying" aside from the animal rights wing, but the comments basically confirm it.

I know that answering "How can we build a movement that matches religious movements in output (http://lesswrong.com/.../can_humanism_match_religions.../), how can we grow and build effectiveness, how can we coordinate like the best, how can we overcome that people think that charity is a scam?" with "Have we considered /becoming pathological liars/? I've not proven it can't work, so let's assume it does and debate from there" is fun and edgy, but it's also terrible.

I can think of circumstances where I'd void my GWWC pledge; if they ever pulled any of this "lying to get more donations" stuff, I'd stick with TLYCS and a personal commitment but leave their website.