Comment author: Michael_PJ 14 October 2017 09:44:53PM 0 points [-]

This is great stuff! Really appreciate the effort you put into measuring things.

Comment author: Michael_PJ 14 October 2017 09:41:40PM 4 points [-]

Thanks for this, detailed post-mortems like this are very valuable!

Some thoughts:

  1. I considered getting involved in the project, but was somewhat put off by the messaging. Somehow it came across as a "learning exercise for students" rather than "attempt to do actually new research". Not sure exactly why that was (the grant size may have been a part, see below), and I now regret not getting more involved.

  2. You describe the grant amount of £10,000 as "substantial". This is surprising to me, since my reaction to the grant size was that it was too small to bother with. I think this corroborates your thoughts about grant size: any size of grant would have had most of the beneficial effects that you saw, but a much larger grant would have been needed to make it seem really "serious".

  3. I think that the project goal was too ambitious. Global prioritization is much harder than more restricted prioritization, but also vaguer and more abstract. Usually when we're learning to deal with vague and abstract problems we start out by becoming very adept with simple, concrete versions to build skills and intuitions before moving up the abstraction hierarchy (easier, better feedback, more motivating, etc.). If I wanted to train up some prioritization researchers I would probably start by getting them to just do lots of small, concrete prioritization tasks.

  4. As Michael Plant says below, I think the project was in a bit of an awkward middle ground. The costs of participation (in terms of work and "top-of-mind" time) were perhaps a bit too high for either students or otherwise-busy community members (like myself), and the perceived benefits (in terms of expected quality of research produced) were perhaps too low for the professionals. (To elaborate on why engaging felt like it would be substantial work for me: in order to provide good commentary on one of your posts, I would have had to: read the post; probably read some prior posts; think hard about it; possibly do some research myself; condense that into a thoughtful reply. That could easily take up an evening of my time, for not a huge perceived reward.) I think your suggestion of running such a project as a week-long retreat is a good idea - it would get a committed block of time from people, and prevents inefficiencies due to repeated time spent "re-loading" the background information.

  5. Agree that quantitative modelling is great and under-utilised. I think a course which was more or less How To Measure Anything applied to EA with modern techniques and technologies would be a fantastic starter for prioritization research, and give people generally useful skills too.

  6. I would have preferred less, higher-quality output from the project. My reaction to the first few blog posts was that they were fine but not terribly interesting, which meant I largely didn't read much of the rest of the content until the models started appearing, which I did find interesting.

  7. Even if you think the project was net-negative, I hope this doesn't put you off starting new things. Exploration is very valuable, even if the median case is a failure.

Comment author: Michael_PJ 30 September 2017 11:15:47AM 3 points [-]

Interesting! Is there a plan to evaluate the grant projects after they reach some kind of "completion" point?

In response to The Turing Test
Comment author: Michael_PJ 17 September 2017 07:17:52PM 1 point [-]

Is there any way to make it available without using iTunes?

Comment author: MichaelPlant 17 August 2017 01:53:57PM 4 points [-]

This is sort of a meta-comment, but there's loads of important stuff here, each of which could have its own thread. Could I suggest that someone (else) organises a (small) conference to discuss some of these things?

I've got quite a few things to add on the ITN framework but nothing I can say in a few words. Relatedly, I've also been working on a method for 'cause search' - a way of finding all the big causes in a given domain - which is the step before cause prio, but that's not something I can write out succinctly either (yet, anyway).

Comment author: Michael_PJ 17 August 2017 05:56:12PM 2 points [-]

I have a lot of thoughts on cause search, but possibly at a more granular level. One of the big challenges when you switch from an assessing to a generating perspective is finding the right problems to work on, and it's not easy at all.

Comment author: Gregory_Lewis 21 July 2017 12:28:45PM *  0 points [-]

Mea culpa. I was naively thinking of super-imposing the 'previous' axes. I hope the underlying worry still stands given the arbitrarily many sets of mathematical objects which could be reversibly mapped onto phenomenological states, but perhaps this betrays a deeper misunderstanding.

Comment author: Michael_PJ 22 July 2017 12:03:09AM 0 points [-]

If they're isomorphic, then they really are the same for mathematical purposes. Possibly if you view STV as having a metaphysical component then you incur some dependence on philosophy of mathematics to say what a mathematical structure is, whether isomorphic structures are distinct, etc.

Comment author: Michael_PJ 21 July 2017 11:58:20PM 5 points [-]

Interesting that you mention the "waterfall"/"bag of popcorn" argument against computationalism in the same article as citing Scott Aaronson, since he actually gives some arguments against it (see section 6 of his paper). In particular, he suggests that we can argue that a process P isn't contributing any computation when having a P-oracle doesn't let you solve the problem faster.

I don't think this fully lays to rest the question of what things are performing computations, but I think we can distinguish them in some ways, which makes me hopeful that there's an underlying distinction.

There's always going to be a huge epistemic problem, of course. Homomorphic encryption shows that there will always be computations that we can't distinguish from noise (I just wrote a blog post about this - curse Scott and his beating me to the punch by years). But I think we can reasonably expect such things to be rare in nature.

Comment author: Julia_Wise 10 July 2017 01:19:51PM 7 points [-]

Ben's right that we're in the process of updating the GWWC website to better reflect our cause-neutrality.

Comment author: Michael_PJ 12 July 2017 09:43:49AM 1 point [-]

Hm, I'm a little sad about this. I always thought that it was nice to have GWWC presenting a more "conservative" face of EA, which is a lot easier for people to get on board with.

But I guess this is less true with the changes to the pledge - GWWC is more about the pledge than about global poverty.

That does make me think that there might be space for an EA org that explicitly focussed on global poverty. Perhaps GiveWell already fills this role adequately.

Comment author: Michael_PJ 25 April 2017 10:51:38PM 6 points [-]

This looks pretty similar to a model I wrote with Nick Dunkley way back in 2012 (part 1, part 2). I still stand by that as a reasonable stab at the problem, so I also think your model is pretty reasonable :)

Charity population:

You're assuming a fixed pool of charities, which makes sense given the evidence gathering strategy you've used (see below). But I think it's better to model charities as an unbounded population following the given distribution, from which we can sample.

That's because we do expect new opportunities to arise. And if we believe that the distribution is heavy-tailed, a large amount of our expected value may come from the possibility of eventually finding something way out in the tails. In your model we only ever get N opportunities to get a really exceptional charity - after that we are just reducing our uncertainty. I think we want to model the fact that we can keep looking for things out in the tails, even if they maybe don't exist yet.

I do think that a lognormal is a sensible distribution for charity effectiveness. The real distribution may be broader, but that just makes your estimate more conservative, which is probably fine. I just did the boring thing and used the empirical distribution of the DCP intervention cost-effectiveness (note: interventions, not charities).
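As a rough sketch of why the unbounded population matters (the lognormal parameters and pool size here are invented for illustration, not calibrated to DCP or anything else): with a fixed pool the best discoverable charity is capped, whereas continued sampling lets the running maximum keep creeping out into the tail.

```python
import random

random.seed(0)

def sample_effectiveness(n, mu=0.0, sigma=1.5):
    """Draw n charity effectiveness values from a lognormal distribution.
    mu/sigma are illustrative, not fitted to any real data."""
    return [random.lognormvariate(mu, sigma) for _ in range(n)]

# Fixed pool of N charities: the best we can ever find is the pool maximum.
fixed_pool = sample_effectiveness(50)
best_fixed = max(fixed_pool)

# Unbounded population: keep drawing new opportunities and track the
# best seen so far; the expected maximum grows (slowly) with more draws.
best_growing = []
pool = []
for _ in range(1000):
    pool.append(random.lognormvariate(0.0, 1.5))
    best_growing.append(max(pool))

print(f"best of fixed pool of 50: {best_fixed:.2f}")
print(f"best after 1000 draws:    {best_growing[-1]:.2f}")
```

With a heavy-tailed distribution, most of the gain from the later draws comes from the rare chance of a far-out tail event, which is exactly the effect a fixed pool of N can't capture.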

Evidence gathering strategy:

You're assuming that the evaluator does a lot of evaluating: they evaluate every charity in the pool in every round. In some sense I suppose this is true, in that charities which are not explicitly "investigated" by an evaluator can be considered to have failed the first test by not being notable enough to even be considered. However, I still think this is somewhat unrealistic and is going to drive diminishing returns very quickly, since we're really just waiting for the errors for the various charities to settle down so that the best charity becomes apparent.

I modelled this as the evaluator sequentially evaluating a single charity, chosen at random (with replacement). This is also unrealistic, because in fact an evaluator won't waste their time with things that are obviously bad, but even with this fairly conservative strategy things turned out pretty well.
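A minimal sketch of that sequential strategy (pool size, noise level, and round count are all made-up assumptions): each round the evaluator re-measures one randomly chosen charity, and the running recommendation is whichever charity has the best mean observed effectiveness.

```python
import random

random.seed(1)

N = 20  # hypothetical pool size
true_eff = [random.lognormvariate(0.0, 1.0) for _ in range(N)]
NOISE = 0.5  # std-dev of measurement error, illustrative

# Running list of noisy observations per charity.
observations = [[] for _ in range(N)]

def evaluate_once():
    """One evaluation round: re-measure a single random charity."""
    i = random.randrange(N)
    observations[i].append(true_eff[i] + random.gauss(0.0, NOISE))

def current_pick():
    """Recommend the charity with the best mean observed effectiveness."""
    means = [sum(obs) / len(obs) if obs else 0.0 for obs in observations]
    return max(range(N), key=lambda i: means[i])

for _ in range(2000):
    evaluate_once()

pick = current_pick()
print(f"recommended charity's true effectiveness: {true_eff[pick]:.2f}")
print(f"actual best in pool:                      {max(true_eff):.2f}")
```

The diminishing returns show up directly here: early rounds move the recommendation a lot, while later rounds mostly just shrink the error bars around an already-stable pick.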

I think it's interesting to think about what happens when we model the pool more explicitly, and consider strategies like investigating the top recommendation further to reduce error.

Increasing scale with money moved:

Charity evaluators have the wonderful feature that their effectiveness scales more or less linearly with the amount of money they move (assuming that the money all goes to their top pick). This is a pretty great property, so worth mentioning.

The big caveat there is room for more funding, or saturation of opportunities. I'm not sure how best to model this. We could model charities rather as "deposits" of effectiveness that are of a fixed size when discovered, and can be exhausted. I don't know how that would change things, but I'd be interested to see! In particular, I suspect it may be important how funding capacity co-varies with effectiveness. If we find a charity with a cost-effectiveness that's 1000x higher than our best, but it can only take a single dollar, then that's not so great.
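One crude way to sketch such "deposits" (the effectiveness and capacity numbers below are invented): treat each charity as a pair of (cost-effectiveness, funding capacity), allocate money greedily to the most effective unexhausted one, and watch average returns fall as money moved grows.

```python
# Each charity is a (cost_effectiveness, funding_capacity) "deposit";
# numbers are purely illustrative.
charities = [
    (10.0, 1_000),    # very effective, tiny capacity
    (5.0, 50_000),
    (2.0, 500_000),
]

def impact(total_money):
    """Greedy allocation: fill the most effective deposits first."""
    remaining = total_money
    total = 0.0
    for eff, cap in sorted(charities, reverse=True):
        spend = min(remaining, cap)
        total += eff * spend
        remaining -= spend
        if remaining <= 0:
            break
    return total

for m in (1_000, 100_000, 1_000_000):
    print(f"money moved: {m:>9,}  impact: {impact(m):>12,.0f}  "
          f"avg effectiveness: {impact(m) / m:.2f}")
```

This makes the co-variance point concrete: the 10x deposit barely matters because it saturates immediately, so at scale almost all the impact comes from the large-capacity, lower-effectiveness deposits.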

Comment author: RyanCarey 23 April 2017 11:08:18PM *  19 points [-]

Some feedback on your feedback (I've only quickly read your post once, so take it with a grain of salt):

  • I think that this is more discursive than it needs to be. AFAICT, you're basically arguing that you think that decision-making and trust in the EA movement is over-concentrated in OpenPhil.
  • If it was a bit shorter, then it would also be easier to run it by someone involved with OpenPhil, which prima facie would be at least worth trying, in order to correct any factual errors.
  • It's hard to do good criticism, but starting out with long explanations of confidence games and Ponzi schemes is not something that makes the criticism likely to be well-received. You assert that these things are not necessarily bad, so why not just zero in on the thing that you think is bad in this case?
  • So maybe this could have been split into two posts?
  • Maybe there are more upsides of having somewhat concentrated decision-making than you let on? Perhaps cause prioritization will be better? Since EA funds is a movement-wide scheme, perhaps reputational trust is extra important here, and the diversification would come from elsewhere? Perhaps the best decision-makers will naturally come to work on this full-time.

You may still be right, though I would want some more balanced analysis.

Comment author: Michael_PJ 24 April 2017 10:53:05PM 2 points [-]

I found the analogy with confidence games thought-provoking, but it could have been a bit shorter.