Many people here, myself included, are very concerned about the risks from rapidly improving artificial general intelligence (AGI). A significant fraction of people in that camp give to the Machine Intelligence Research Institute, or recommend others do so.
Unfortunately, for those who lack the necessary technical expertise, this is partly an act of faith. I am in some position to evaluate the arguments about whether safe AGI is an important cause. I'm also in some position to evaluate the general competence and trustworthiness of the people working at MIRI. On those counts I am satisfied, though I know not everyone is.
However, I am in a poor position to evaluate:
- The quality of MIRI's past research output.
- Whether their priorities are sensible or clearly dominated by alternatives.
I would therefore like to see a survey on these questions. Whoever runs it should:
- Have an existing reputation for trustworthiness and confidentiality.
- Think that AI risk is an important cause, but have no particular convictions about the best approach or organisation for dealing with it. They shouldn't have worked for MIRI in the past, but will presumably have some association with the general rationality or AI community.
- Involve 10-20 people, including a sample of present and past MIRI staff, people at organisations working on related problems (CFAR, FHI, FLI, AI Impacts, CSER, OpenPhil, etc.), and largely unconnected math/AI/CS researchers.
- Results should be compiled by two or three people - ideally with different perspectives - who will summarise the results in such a way that nothing in the final report could identify what any individual wrote (unless they are happy to be named). Their goal should be purely to represent the findings faithfully, given the constraints of brevity and confidentiality.
- The survey should ask about:
  - Quality of past output.
  - Suitability of staff for their roles.
  - Quality of current strategy/priorities.
  - Quality of operations and other non-research aspects of implementation, etc.
  - How useful more funding/staff would be.
  - Comparison with the value of work done by other related organisations.
  - Suggestions for how the work or strategy could be improved.
- Obviously participants should only comment on what they know about. The survey should link to MIRI's strategy and recent publications.
- MIRI should be able to suggest people to be contacted, but so should the general public through an announcement. They should also have a chance to comment on the survey itself before it goes out. Ideally it would be checked by someone who understands good survey design, as subtle aspects of wording can be important.
- It should be impressed on participants how valuable it is to be open and thoughtful in their answers, so as to maximise the chances of solving the problem of AI risk in the long run.
I think that it's probably quite important to define in advance what sorts of results would convince us that the quality of MIRI's performance is either sufficient or insufficient. Otherwise I expect those already committed to some belief about MIRI's performance to treat the survey as evidence for their existing belief, even if someone with the opposite belief also treats it as evidence for theirs.
Relatedly, I also worry about the uniqueness of the problem and how it might change what we consider a cause worth donating to. You don't seem to be saying that you could understand MIRI's arguments, see no flaws, and still be inclined to say "I still can't be sure that this is the right way to go." Even so, I expect that many people are averse to donating to causes like MIRI because the effectiveness of the proposed interventions does not admit of simple testing. With existential risks, empirical testing is often impossible in the traditional sense, although sometimes possible in a limited sense. Results about sub-existential pandemic risk are probably at least somewhat relevant to the study of existential pandemic risk, for example. But it's not the same as distributing bed nets, looking at the malaria incidence, adjusting, reobserving, and so on. It's not like we can perform an action, look through a time warp, and see whether or not the world ends in the future.
What I'm getting at is that, even if these interventions do turn out to be testable in some way, it is worth considering the implications if they were genuinely untestable. I think that there are some people who would refuse to donate to existential risk charities merely because other charities have interventions that can be tested for effectiveness. And this concerns me. If it is not through human failing that we don't test the effectiveness of our interventions, but because the nature of the problem makes such testing impossible, do you choose to do nothing? That is not a rhetorical question. I genuinely believe that we are confused about this and that MIRI is an example of a cause that may be difficult to evaluate without resolving this confusion. This is related to ambiguity aversion in cognitive science and decision theory.
Ambiguity aversion appears in choices between betting on known and unknown risks, and not in choices to bet or not to bet on unknown risks in non-comparative contexts. However, effective altruists consider almost all charitable decisions within the context of cause prioritization, so we might expect EAs to encounter more comparative contexts than a random philanthropist, and thus to exhibit more bias against causes with ambiguity, even if the survey itself would technically be focusing on one cause. It's noteworthy that the expected utility formalism and human behavior differ here: the formalism prescribes indifference between bets with known and unknown probabilities when each bet has the same payoffs. (In reality the situation is not even this clear, for the payoffs of successfully intervening upon malaria incidence as opposed to human extinction are hardly equal.) I think we must genuinely ask whether we should be averse to ambiguity in general, attempt to explain why this heuristic was evolutionarily adaptive, and see whether the problem of existential risk is a case where we should, or should not, use ambiguity aversion as a heuristic. After all, a humanity that attempts no interventions on the problem of existential risk merely because it cannot test the effectiveness of its interventions is a humanity that ignores existential risk and goes extinct for it, even if we believed that we were being virtuous philanthropists the entire time.
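To make the indifference point concrete, here is a minimal sketch of an Ellsberg-style comparison. Everything in it is an illustrative assumption on my part rather than anything specific to MIRI: the two urns, the payoff of 100, the uniform prior over the unknown urn's composition, and the use of a worst-case (maxmin) rule as a stand-in for an ambiguity-averse agent.

```python
from fractions import Fraction

PAYOFF = 100  # hypothetical prize for drawing a red ball


def expected_utility(p_win, payoff):
    """Expected utility of a bet that pays `payoff` with probability `p_win`."""
    return p_win * payoff


# Known urn: exactly half the balls are red.
known_p = Fraction(1, 2)

# Unknown urn: 100 balls with anywhere from 0 to 100 red.
# Assumption: a uniform prior over the 101 possible compositions.
compositions = [Fraction(k, 100) for k in range(101)]
prior = Fraction(1, len(compositions))
unknown_p = sum(prior * p for p in compositions)

eu_known = expected_utility(known_p, PAYOFF)
eu_unknown = expected_utility(unknown_p, PAYOFF)

# A worst-case (maxmin) agent scores the unknown urn by its least
# favourable composition instead of averaging over the prior.
maxmin_unknown = min(expected_utility(p, PAYOFF) for p in compositions)

print("Expected utility, known urn:  ", eu_known)
print("Expected utility, unknown urn:", eu_unknown)
print("Worst-case value, unknown urn:", maxmin_unknown)
```

Under these assumptions the expected utility agent values both bets identically, while the worst-case agent values the unknown urn at zero and so always prefers the known one. The gap between those two valuations is the kind of disagreement I am pointing at, though real cause comparisons also differ in payoffs, as noted above.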