
A motivating scenario could be: imagine you are trying to provide examples to help convince a skeptical friend that it is in fact possible to positively change the long-run future by actively seeking and pursuing opportunities to reduce existential risk.

Examples of things that are kind of close but miss the mark

  • There are probably decent historical examples where people reduced existential risk but where those people didn't really have longtermist-EA-type motivations (maybe more "generally wanting to do good" plus "in the right place at the right time")
  • There are probably meta-level things that longtermist EA community members can take credit for (e.g. "get lots of people to think seriously about reducing x risk"), but these aren't very object-level or concrete


4 Answers

A lot of longtermist effort is going into AI safety at the moment. I think it's hard to make the case that something in AI safety has legibly or concretely reduced AI risk, since (a) the field is still considered quite pre-paradigmatic, (b) the risk comes from systems that are more powerful than the ones we currently have, and (c) even in less speculative fields, research often takes several years before it is shown to legibly help anyone.

But with those caveats in mind, I think:

  1. The community has made some progress in understanding possible risks and threats from advanced AI systems (see DeepMind's review of alignment threat models).
  2. Interpretability research seems relatively legible. The basic case "we're building powerful models and it would be valuable to understand how they work" makes intuitive sense. There are also several more nuanced ways interpretability research could be helpful (see Neel's longlist of theories for impact).
  3. The fact that most of the major AGI labs have technical safety teams and governance teams seems quite concrete/legible. I'm not sure how much credit should go to the longtermist community, but I think several of these teams have been inspired/influenced by ideas in the AI safety community. (To be fair, this might just be a case of "get lots of people to think seriously about reducing x-risk", but I think it's a bit more tangible/concrete.)
  4. In AI governance, the structured access approach seems pretty common among major AGI labs (again, a bit unclear how much credit should go to longtermists, but my guess is a non-negligible amount).
  5. In AI governance, some work on reducing misuse risks and recognizing the dual-use nature of AI technologies seems somewhat legible. A lot of the people who did this research are now working at major AGI labs, and it seems plausible that they're implementing some of the best practices they suggested (which would be especially legible, though I'm not aware of any specific examples; this might be because labs keep a lot of this stuff confidential).

I think it's also easy to make a case that longtermist efforts have increased existential risk from artificial intelligence, given that the money and talent that grew some of the biggest hype machines in AI (DeepMind, OpenAI) came from longtermist places.

It's possible that EA has shaved a couple of counterfactual years off the time to catastrophic AGI, compared to a world where the community wasn't working on it.

Can you say more about which longtermist efforts you're referring to?

I think a case can be made, but I don't think it's an easy (or clear) case.

My current impression is that Yudkowsky & Bostrom's writings about AGI inspired the creation of OpenAI/DeepMind. And I believe FTX invested a lot in Anthropic and OP invested a little bit (in relative terms) into OpenAI. Since then, there have been capabilities advances and safety advances made by EAs, and I don't think it's particularly clear which outweighs the other.

It seems unclear to me what the sign of these effects is. Like, maybe no one thinks about AGI for decades. Or maybe 3-5 years after Yudkowsky starts thinking about AGI, someone else much less safety-concerned starts thinking about AGI, and we get a world with AGI labs that are much less concerned about safety than the status quo.

I'm not advocating for this position, but I'm using it to illustrate how the case seems far-from-easy. 

Is most of the AI capabilities work here causally downstream of Superintelligence, even if Superintelligence may have been (heavily?) influenced by Yudkowsky? Both Musk and Altman recommended Superintelligence, although Altman has also directly said Yudkowsky has accelerated timelines the most:

https://twitter.com/elonmusk/status/495759307346952192?lang=en

https://blog.samaltman.com/machine-intelligence-part-1

https://twitter.com/sama/status/1621621724507938816

If things had stayed in the LW/Rat/EA community, that might have been best. If Yudkowsky hadn't written about AI, then there might not be much of an AI safety community at all now (it might just be MIRI quietly hacking away at it, and most of MIRI seems to have given up now), and doom would be more likely, just later. Someone had to write about AI safety publicly to build the community, but writing and promoting a popular book on the topic is much riskier, because you bring it to the attention of uncareful people, including entrepreneurial types.

I guess they might have tried to keep the public writing limited to academia, but the AI community has been pretty dismissive of AI safety, so it might have been too hard to build the community that way.

Did Superintelligence have a dramatic effect on people like Elon Musk? I can imagine Elon getting involved without it. That involvement might have been even more harmful (e.g. starting an AGI lab with zero safety concerns). Here's one notable quote about Elon (source), who started college over 20 years before Superintelligence: Overall, causality is multifactorial and tricky to analyze, so concepts like "causally downstream" can be misleading. (Nonetheless, I do think it's plausible that publishing Superintelligence was a bad idea, at least in 2014.)

Thanks for these!

I think my general feeling on these is that it's hard for me to tell whether they actually reduced existential risk. Maybe this is just because I don't understand the mechanisms for a global catastrophe from AI well enough. (Because of this, linking to Neel's longlist of theories for impact was helpful, so thank you for that!)

For example, my impression is that some people with relevant knowledge seem to think that technical safety work currently can't achieve very much.

(Hopefully this response isn't too annoying -- I could put in the work to understand the mechanisms for a global catastrophe from AI better, and maybe I will get round to this someday)

The CLTR Future Proof report has influenced UK government policy at the highest levels.

E.g. the UK National AI Strategy ends with a section on AGI risk, and says that the Office for AI should pay attention to this.

I think that working out how resilient food technologies (including natural gas (methane) protein, hydrogen protein, greenhouses, seaweed, leaf protein concentrate, fat from petroleum, relocating cool-tolerant crops, etc.) could be scaled up in a catastrophe such as nuclear winter is legible and concrete. Indeed, a survey and a poll have indicated that this work has reduced existential risk.

If you think the UN matters, then this seems good:

On September 10th 2021, the Secretary General of the United Nations released a report called “Our Common Agenda”. This report seems highly relevant for those working on longtermism and existential risk, and appears to signal unexpectedly strong interest from the UN. It explicitly uses longtermist language and concepts, and suggests concrete proposals for institutions to represent future generations and manage catastrophic and existential risks.

https://forum.effectivealtruism.org/posts/Fwu2SLKeM5h5v95ww/major-un-report-discusses-existential-risk-and-future

Comments

I think this is a great question. The lack of clear, demonstrable progress in reducing existential risks, and the difficulty of making and demonstrating any progress, make me very skeptical of longtermism in practice.

I think shifting focus from tractable, measurable issues like global health and development to issues that, while critical, are impossible to reliably affect might be really bad.

I don't think that a lack of concrete/legible examples of existential risk reduction so far should make us move to other cause areas. 

The main reason is that it might be unsurprising for a movement to take a while to properly get going. I haven't researched this, but it seems unsurprising to me that movements typically start with a period of increasing awareness and growing the number of people working in the movement (a period I think we are currently still in), before achieving really concrete wins. The longtermist movement is a new one with mostly young people who have reoriented their careers but generally haven't yet reached senior enough positions to effect real change.

If you actually buy into the longtermist argument, then why give up now? It seems unreasonable to me to conclude from the fact that we haven't yet achieved concrete change that we are very unlikely to ever do so in the future.

I don't think that a lack of concrete/legible examples of existential risk reduction so far should make us move to other cause areas. 

Perhaps not. But if a movement is happy to use estimates like "our X-risk is 17% this century" to justify working on existential risks, and to call it the most important thing you can do with your life, but cannot measure how their work actually decreases this 17% figure, they should at the very least reconsider whether their approach is achieving their stated goals.

The longtermist movement is a new one with mostly young people who have reoriented their careers but generally haven't yet reached senior enough positions to effect real change.

I think this is misleading, because:

Longtermism has been part of EA since close to its very beginning, and many senior leaders in EA are longtermists.

It's true that global health as an area is newer than AI safety, but given EA GHD isn't taking credit for things that happened before EA existed, like eradicating smallpox, I don't know if this is actually the "main reason".

If you actually buy into the longtermist argument, then why give up now? It seems unreasonable to me to conclude from the fact that we haven't yet achieved concrete change that we are very unlikely to ever do so in the future.

You might buy into the longtermism argument at a general level ("Future lives matter", "the future is large", "we can affect the future"), but update about some of the details, such that you think planning for and affecting the far future is much more intractable or premature than you previously thought. Otherwise, are you saying there's nothing that could happen that would change your mind on whether longtermism was a good use of EA resources?

but cannot measure how their work actually decreases this 17% figure, they should at the very least reconsider whether their approach is achieving their stated goals.

I'm not sure how it's even theoretically possible to measure reductions in existential risk. An existential catastrophe is something that can only happen once. Without being able to observe a reduction in incidence of an event I don't think you can "measure" reduction in risk. I do on the other hand think it's fair to say that increasing awareness of existential risk reduces total existential risk, even if I'm not sure by how much exactly.

Longtermism has been part of EA since close to its very beginning, and many senior leaders in EA are longtermists.

I'd imagine concrete/legible actions to reduce existential risk will probably come in the form of policy change, and I don't think EAs have, for the most part, yet entered influential policy positions. Please do say what other actions you would consider to count as concrete/legible though, as that is up for interpretation.

It's true that global health as an area is newer than AI safety, but given EA GHD isn't taking credit for things that happened before EA existed, like eradicating smallpox, I don't know if this is actually the "main reason".

Sorry I'm not really sure what you're saying here.

Otherwise, are you saying there's nothing that could happen that would change your mind on whether longtermism was a good use of EA resources?

This is a good question. I think the best arguments against longtermism are:

  •  That longtermism is fanatical and that fanaticism is not warranted
    • On balance I am not convinced by this objection as I don't think longtermism is fanatical and am unsure if fanaticism is a problem. But further research here might sway me.
  • That we might simply be clueless about the impacts of our actions 
    • At the moment I don't think we are, and I think if cluelessness is a big issue it is very likely to be an issue for neartermist cause areas as well, and even EA altogether.

I don't mind admitting that it seems unlikely that I will change my mind on longtermism. If I do, I'd imagine it will be on account of one of the two arguments above. 

I'm not sure how it's even theoretically possible to measure reductions in existential risk. Without being able to observe a reduction in incidence of an event I don't think you can "measure" reduction in risk. 

I disagree - what do you think the likelihood of a civilization ending event from engineered pandemics is, and what do you base this forecast on?

I'd imagine concrete/legible actions to reduce existential risk will probably come in the form of policy change 

What % of longtermist $ and FTEs do you think are being spent on trying to influence policy versus technical or technological solutions? (I would consider many of these as concrete + legible)

Sorry I'm not really sure what you're saying here.

That was me trying to steelman your justification for the lack of concrete/legible wins ("longtermism is new") by thinking of clearer ways that longtermism is different to neartermist causes, and that requires looking outside the EA space.

I disagree - what do you think the likelihood of a civilization ending event from engineered pandemics is, and what do you base this forecast on?

As I say I don’t think one can “measure” the probability of existential risk. I think one can estimate it through considered judgment of relevant arguments but I am not inclined to do so and I don’t think anyone else should be so inclined either. Any such probability would be somewhat arbitrary and open to reasonable disagreement. What I am willing to do is say things like “existential risk is non-negligible” and "we can meaningfully reduce it”. These claims are easier to defend and are all we really need to justify working on reducing existential risk.

What % of longtermist $ and FTEs do you think are being spent on trying to influence policy versus technical or technological solutions? (I would consider many of these as concrete + legible)

No idea. Even if the answer is a lot and we haven’t made much progress, this doesn’t lead me away from longtermism. Mainly because the stakes are so high and I think we’re still relatively new to all this so I expect us to get more effective over time, especially as we actually get people into influential policy roles.

That was me trying to steelman your justification for the lack of concrete/legible wins ("longtermism is new") by thinking of clearer ways that longtermism is different to neartermist causes, and that requires looking outside the EA space.

This may be because I’m slightly hungover but you’re going to have to ELI5 your point here!

I imagine a lot of relevant stuff could be infohazardous (although that stuff might not do very well on the "legible" criterion) -- if so and if you happen to feel comfortable sharing it with me privately, feel free to DM me about it.

Just out of curiosity, and maybe it'd help readers with answers, could you share why you are interested in this question? 

I think my motivation comes from things to do with: helping with my personal motivation for work on existential risk, helping me form accurate beliefs on the general tractability of work on existential risk, and helping me advocate to other people about the importance of work on existential risk.

Thinking about it, maybe it would be pretty great to have someone assemble and maintain a good public list of answers to this question! (Or maybe someone already did and I don't know about it.)
