
I gave a talk introducing AI alignment / risks from advanced AI in June 2022, aimed at a generally technical audience. However, given how fast AI has been moving, I felt I needed an updated talk. I've made a new one closely based on Richard Ngo's Twitter thread, itself based on The Alignment Problem from a Deep Learning Perspective. There's still too much text, but these slides are updated through March 2023 and have a more technical lens.

People are welcome to use this without attribution, and I hope it's useful for any fieldbuilders who want to improve it! I'm also happy to give this talk if people would like me to -- the slides take about 45 minutes, with whatever time remains for discussion.

New talk slides: The Alignment Problem: Potential Risks from Highly-Capable AI

 

Main thesis slide:


Appendix

Bonus data that I collected after the talk (which was given to AI safety academics):

Survey questions (11 responses each):

  • When (what year) do you think highly advanced AI will exist?
  • What probability do you put on human inability to control future advanced AI systems causing human extinction or similarly permanent and severe disempowerment of the human species? (As a percentage between 0-100)
  • How did your beliefs about risks from advanced AI change after this?
  • This was _____ on my beliefs surrounding advanced AI.

Comments:

  • Great talk! I liked the clear description of the relative resources going into alignment and improvement of capabilities.
  • "Not influential" only because I have already read a lot on this topic :-)
  • Sorry to be a downer I just don’t believe in this stuff, I’m not a materialist
  • I think the alignment problem is one that we, as humans, may not be able to figure out.
Comments (4)

I want to highlight aisafety.training, which I think is currently the single most useful link to give to anyone who wants to join the effort of AI Safety research.

Whoever gave me a disagreement vote, I'd be interested to hear why. No pressure, though.

I didn't give a disagreement vote, but I do disagree with aisafety.training being the "single most useful link to give anyone who wants to join the effort of AI Safety research", just because there are a lot of different resources out there and I think "most useful" depends on the audience. I do think it's a useful link, but "most useful" is a high bar to meet!

I agree that it's not the most useful link for everyone. I can see how my initial message was ambiguous about this. What I meant is that, of all the links I know, I expect this one to be the most useful on average.

Like, if I meet someone and have a conversation with them, and I had to constrain myself to giving them only a single link, I might pick another resource based on their personal situation. But if I wrote a post online or gave a talk to a broad audience, and I had to pick only one link to share, it would be this one.
