Yes I was speaking somewhat loosely. It is nevertheless in my view very implausible that the intervention would sustain its effect for that long - we're talking about the effect of one video here. Do you think the chance of fade-out within a year is less than 10%? What is your median estimate?

Are you talking about the individual level, or the mean? My estimate would be that, for the median individual, the effect will have faded out after at most 6 months. However, the mean might be influenced quite strongly by the tails.

Thinking about it for a bit longer, a mean effect of 12 years does seem quite implausible, though. In the limiting case, where only the tails matter, this would be equivalent to convincing around 25% of the initially influenced students to stop eating pork for the rest of their lives.
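To make the limiting case concrete, here is a minimal sketch of the arithmetic. The remaining lifetime of 48 years is my own assumption, chosen because it reproduces the 25% figure; the function name is hypothetical as well.

```python
def required_lifelong_share(mean_duration_years, remaining_lifetime_years):
    """Share of influenced students who must quit pork for life so that the
    mean duration equals mean_duration_years, if everyone else reverts
    immediately (the 'only the tails matter' limiting case)."""
    return mean_duration_years / remaining_lifetime_years


# Assumption: a student has roughly 48 years of life remaining.
share = required_lifelong_share(12, 48)
print(f"{share:.0%} of influenced students would need to quit for life")
```

With a longer assumed remaining lifetime the required share drops somewhat (60 years gives 20%), but it stays far above what a single video plausibly achieves.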

The upper bound for my 90% confidence interval for the mean seems to be around 3 years, while the lower bound is at 3 months. The probability mass within the interval is mostly centered to the left.

The assumptions here about the persistence of the effect seem over-optimistic.

You measure the effect after one month and then assume that it will persist for 1 to 12 years (90% CI). So, you assign a less than 10% chance that the effect will fade out within a year. You made this decision "arbitrarily" on the basis of an ACE meta-analysis investigating how long people who say they don't eat meat have not eaten meat without interruption. The first thing to say is that this is testing a very different population and so is of questionable relevance to the Animal Equality intervention. In the ACE study, the sample is people who say they have made the commitment to be vegetarian. In yours, it is people who have been shown a video and who say they haven't eaten pork a month on.

Given that we are working with fairly arbitrary intuitions here, I find it highly surprising that the 90% CI doesn't include fade out of the effect within a year. My *median* estimate is that the effect fades out within a year. I'd be curious to hear what other people think about this.

But you think there is around a 10% chance that the effect will fade out after 12 years. The claim is that there is a 10% chance that being shown an animal advocacy video on one day will have an effect on consumption decisions 12 years down the line. I would put the chance of this at ~0%.

If I am right and a more reasonable estimate of persistence seems to be closer to 6 months (I actually think I'm being conservative here - I'd guess closer to 2-3 months), this suggests you should revise your cost-effectiveness estimate down by an order of magnitude.

The claim does not seem to be exactly that there is a 10% chance of an animal advocacy video affecting consumption decisions after 12 years for a given individual.

I'd interpret it as: there is a 5% chance that the mean duration of the reduction, conditional on the participant reporting that they changed their behaviour because of the video, is higher than 12 years.

This could, for example, also be achieved by having a very long-term impact on very few participants. This interpretation seems a lot more plausible, although I am not at all certain whether that claim is correct. Long-term follow-up data would certainly be very helpful.

For the first point, see my response to Carl above. I think you're right in theory, but in practice it's still a problem.

For the second point, I agree with Flodorner that you would either use the Shapley value, or you would use the probability of changing the outcome, not both. I don't know much about Shapley values, but I suspect I would agree with you that they are suboptimal in many cases. I don't think there is a good theoretical solution besides "consider every possible outcome and choose the best one" which we obviously can't do as humans. Shapley values are one tractable way of attacking the problem without having to think about all possible worlds, but I'm not surprised that there are cases where they fail. I'm advocating for "think about this scenario", not "use Shapley values".

I think the $1bn benefits case is a good example of a pathological case where Shapley values fail horribly (assuming they do what you say they do, again, I don't know much about them).

My overall position is something like "In the real world when we can't consider all possibilities, one common failure mode in impact calculations is the failure to consider the scenario in which *all* the participants who contributed to this outcome instead do other altruistic things with their money".

At this point, I think that to analyze the $1bn case correctly, you'd have to subtract everyone's opportunity cost in the calculation of the Shapley value (if you want to use it here). This way, the example should yield what we expect.

I might do a more general writeup about Shapley values, their advantages, disadvantages and when it makes sense to use them, if I find the time to read a bit more about the topic first.

Here is how I would reason about moral weights in this case:

In this case the definition of a "life saved" is pretty different from what it normally means. Normally a life saved means 30 to 80 DALYs averted, depending on whether the intervention targets adults or children. In this case we are talking about potentially thousands of DALYs averted, so a life saved should count more. On the other hand, one should also take into consideration that when saving, for example, children who would have died of malaria, you are also giving them a chance of reaching LEV. It's not a full chance as in the present evaluation, but something probably ranging from 30% to 70%.

Additional consideration: some people may want to consider children more important to save than adults. Introducing age weighting and time discounting could seem reasonable in this case, since even if you save 5000 DALYs you are only saving one person, so you might want to discount DALYs saved later in life. On the other hand, there are reasons to disagree with this approach: saving an old person and guaranteeing that he or she reaches LEV also means "saving a library". A vast amount of knowledge and experience, especially future experience, would otherwise have been completely destroyed. In fact, I am not so sure I would apply time discounting myself for this reason.

Regarding Bayesian discounting:

I just read how GiveWell would go about this (https://blog.givewell.org/2011/08/18/why-we-cant-take-expected-value-estimates-literally-even-when-theyre-unbiased/). To account for it I would need a prior distribution (or more than one?). I also have difficulty making the calculation, since Guesstimate doesn't let me calculate the variance of the random variables. I will try with other means... maybe with smaller data sets and proceeding by hand or using online calculators.

I would also like to introduce probability distributions throughout the analysis and turn some arguments made in the explanations of some variables into variables in their own right, and I would like to add some more information (for example the safety profile and history of metformin and the value of information of the trial) based on the feedback I'm receiving. This would mean rewriting many sections though, and this will require time.

For now I put an "Edit" at the beginning in order to warn readers not to take the numbers reached too seriously, but I invited them to delve into some more broadly applicable ideas I presented in the analysis that could be useful for evaluating many interventions in the cause area of aging.

I think it might be best to just report confidence intervals for your final estimates (Guesstimate should give you those). Then everyone can combine your estimates with their own priors on the effectiveness of interventions in general and thereby potentially correct for the high levels of uncertainty (at least in a crude way, by estimating the variance from the confidence intervals).
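As a rough sketch of what that crude correction could look like: the normal approximation, the function names, and the example numbers below are all my own assumptions, not anything taken from the original model or from GiveWell. Since cost-effectiveness estimates are usually skewed, it may be more sensible to apply this to the logarithm of the estimate.

```python
from statistics import NormalDist


def sd_from_ci(lower, upper, level=0.90):
    """Back out a standard deviation from a central confidence interval,
    assuming the quantity is approximately normally distributed."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (upper - lower) / (2 * z)


def normal_update(prior_mean, prior_sd, est_mean, est_sd):
    """Precision-weighted combination of a normal prior with a normal estimate."""
    prior_precision = 1 / prior_sd ** 2
    est_precision = 1 / est_sd ** 2
    post_var = 1 / (prior_precision + est_precision)
    post_mean = post_var * (prior_precision * prior_mean + est_precision * est_mean)
    return post_mean, post_var ** 0.5


# Hypothetical numbers on a log10 scale: the reader's prior is centred on 0
# with sd 1, and the reported 90% CI for the estimate runs from 1 to 5.
est_sd = sd_from_ci(1.0, 5.0)
post_mean, post_sd = normal_update(0.0, 1.0, 3.0, est_sd)
print(post_mean, post_sd)
```

The wider the reported interval, the smaller the weight the estimate receives relative to the prior, which is the qualitative behaviour the discounting is meant to capture.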

The variance of X can be defined as E[X^2]-E[X]^2, which should not be hard to implement in Guesstimate. However, I am not sure whether having the variance leads to more accurate updating than having a confidence interval. Optimally you'd have the full distribution, but I am not sure whether anyone will actually do the maths to update from there. (But they could get it roughly from your Guesstimate model.)
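If the samples can be exported from the model, the same identity is easy to check outside of Guesstimate. This is a minimal sketch with made-up sample values; the function name is hypothetical.

```python
from statistics import pvariance


def variance_from_moments(samples):
    """Population variance via the identity Var[X] = E[X^2] - E[X]^2."""
    n = len(samples)
    mean = sum(samples) / n
    mean_of_squares = sum(x * x for x in samples) / n
    return mean_of_squares - mean ** 2


samples = [1.0, 2.0, 3.0, 4.0]  # made-up values standing in for model samples
print(variance_from_moments(samples))
print(pvariance(samples))  # the standard library's population variance, for comparison
```

For values that are large relative to their spread, this formula can lose precision in floating point, so a library routine is the safer choice for real data.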

I might comment more on some details and the moral assumptions if I find the time for it soon.

I want to add something: It probably has been discussed before, but it occurs to me that when thinking about prioritisation in general it's almost always better to think at the lowest level possible. That's because the impact per dollar is only evaluable for specific interventions, and because causes that at first don't appear particularly cost effective can hide particular interventions that are. And those particular interventions could be in principle even more cost effective than other interventions in causes that do appear cost effective overall. I think high-level cause prioritisation is mostly good for gaining a first superficial understanding of the promise of a particular class of altruistic interventions.

I disagree. If we are fairly certain that the average intervention in Cause X is 10 times more effective than the average intervention in Cause Y (for comparison, 80,000 Hours currently believes that AI-safety work is 1000 times as effective as global health), it seems like we should strongly prioritize Cause X. Even if there are some interventions in Cause Y which are more effective than the average intervention in Cause X, finding them is probably as costly as finding the most effective interventions in Cause X (unless there is a specific reason why evaluating cost-effectiveness in Cause X is especially costly, or the distributions of intervention effectiveness are radically different between the two causes). Depending on how much we can improve on our current comparative estimates of cause effectiveness, the potential impact of doing so could be quite high, since it essentially multiplies the effects of our lower-level prioritization. Therefore, high- to medium-level prioritization in combination with low-level prioritization restricted to the best causes seems like the way to go. On the other hand, it seems at least plausible that we cannot improve our high-level prioritization significantly at the moment and should therefore focus on the lower level within the most effective causes.

"The alternative approach (which I argue is wrong) is to say that each of the n A voters is counterfactually responsible for 1/n of the $10bn benefit. Suppose there are 10m A voters. Then each A voter's counterfactual social impact is 1/10m*$10bn = $1000. But on this approach the common EA view that it is rational for individuals to vote as long as the probability of being decisive is not too small, is wrong. Suppose the ex ante chance of being decisive is 1/1m. Then the expected value of Emma voting is a mere 1/1m*$1000 = $0.001. On the correct approach, the expected value of Emma voting is 1/10m*$10bn = $1000. If voting takes 5 minutes, this is obviously a worthwhile investment for the benevolent voter, as per common EA wisdom."

I am not sure whether anyone is arguing for discounting twice. The alternative approach using the Shapley value would divide the potential impact amongst the contributors, but not additionally account for the probability. Therefore, in this example both approaches seem to assign the same counterfactual impact.
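To illustrate, here is a small sketch. The toy game (4 identical voters, a benefit of 12 once at least 3 of them vote) is a hypothetical stand-in of my own, since the full game with 10m voters cannot be enumerated; it only shows that symmetric contributors each receive an equal share. The figures after it are the ones from the quoted passage.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    orders = list(permutations(players))
    totals = {p: Fraction(0) for p in players}
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orders) for p, total in totals.items()}


def toy_value(coalition):
    # Hypothetical toy game: the benefit of 12 is realised once 3 of 4 voters take part.
    return Fraction(12) if len(coalition) >= 3 else Fraction(0)


toy = shapley_values(["a", "b", "c", "d"], toy_value)  # each voter gets 12/4

# Figures from the quoted passage.
benefit = 10_000_000_000  # $10bn
n_voters = 10_000_000     # 10m A voters
equal_share = Fraction(benefit, n_voters)                 # split among contributors
expected_value = Fraction(1, 10_000_000) * benefit        # the quoted "correct approach"
double_discounted = Fraction(1, 1_000_000) * equal_share  # share AND probability applied

print(toy["a"], equal_share, expected_value, float(double_discounted))
```

The equal split and the quoted expected value both come out at $1000; only applying the share and the probability together gives $0.001.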

More generally, it seems like most disagreements in this thread could be resolved by a more charitable interpretation of the other side (from both sides, as the validity of your argument against rohinmshah's counterexample seems to show).

Right now, a comment from someone more proficient with the Shapley value arguing against

"Also consider the $1bn benefits case outlined above. Suppose that the situation is as described above but my action costs $2 and I take one billionth of the credit for the success of the project. In that case, the Shapely-adjusted benefits of my action would be $1 and the costs $2, so my action would not be worthwhile. I would therefore leave $1bn of value on the table."

might be helpful for a better understanding.

Interesting analysis! Since you already have confidence intervals for a lot of your model's factors, using the Guesstimate web tool to get a more detailed idea of the uncertainty in the final estimate might be helpful, since some Bayesian discounting based on the estimate's uncertainty might be a sensible thing to do. (https://www.lesswrong.com/posts/5gQLrJr2yhPzMCcni/the-optimizer-s-curse-and-how-to-beat-it)

It might also make sense to make your ethical assumptions more explicit in the beginning (https://www.givewell.org/how-we-work/our-criteria/cost-effectiveness/comparing-moral-weights), especially since the case against aging seems to be less intuitive than most of GiveWell's interventions.

I am not sure whether your usage of economies of scale already covers this, but it seems to make sense to highlight that what matters is the marginal difference of the money for you and your adversary. If doing evil is a lot more efficient at low scales (think of distributing highly addictive drugs among vulnerable populations vs. distributing malaria nets), your adversary could be hitting diminishing returns already, while your marginal returns increase, and the lottery might still not be worth it.