Quick takes

I think some of the AI safety policy community has over-indexed on the visual model of the "Overton Window" and under-indexed on alternatives like the "ratchet effect," "poisoning the well," "clown attacks," and other models where proposing radical changes can make you, your allies, and your ideas look unreasonable. I'm not familiar with a lot of systematic empirical evidence on either side, but it seems to me like the more effective actors in the DC establishment overall are much more in the habit of looking for small wins that are both good in themselves and shrink the size of the ask for their ideal policy than of pushing for their ideal vision and then making concessions.

Possibly an ideal ecosystem has both strategies, but it seems possible that at least some versions of "Overton Window-moving" strategies executed in practice have larger negative effects via associating their "side" with unreasonable-sounding ideas in the minds of very bandwidth-constrained policymakers, who strongly lean on signals of credibility and consensus when quickly evaluating policy options, than the positive effects of increasing the odds of ideal policy and improving the framing for non-ideal but pretty good policies.

In theory, the Overton Window model is just a description of what ideas are taken seriously, so it can indeed accommodate backfire effects where you argue for an idea "outside the window" and this actually makes the window narrower. But I think the visual imagery of "windows" actually struggles to accommodate this -- when was the last time you tried to open a window and accidentally closed it instead? -- and as a result, people who rely on this model are more likely to underrate these kinds of consequences.

Would be interested in empirical evidence on this question (ideally actual studies from psych, political science, sociology, econ, etc. literatures, rather than specific case studies due to reference class tennis type issues).
Excerpt from the most recent update from the ALERT team:

Highly pathogenic avian influenza (HPAI) H5N1: What a week! The news, data, and analyses are coming in fast and furious. Overall, ALERT team members feel that the risk of an H5N1 pandemic emerging over the coming decade is increasing. Team members estimate that the chance that the WHO will declare a Public Health Emergency of International Concern (PHEIC) within 1 year from now because of an H5N1 virus, in whole or in part, is 0.9% (range 0.5%-1.3%). The team sees the chance going up substantially over the next decade, with the 5-year chance at 13% (range 10%-15%) and the 10-year chance increasing to 25% (range 20%-30%).

Their estimated 10-year risk is a lot higher than I would have anticipated.

Recent discussion

About a week ago, Spencer Greenberg and I were debating what proportion of Effective Altruists (EAs) believe enlightenment is real. Since he has a large audience on platform X, we thought a poll would be a good way to increase our confidence in our predictions.

Before I share my commentary, I think in hindsight it would have been better to ask the question like this: 'Do you believe that awakening/enlightenment (which frees a person from most or all suffering for extended periods, like weeks at a time) is a real phenomenon that some people achieve (e.g., through meditation)?' I'm sure there are even better ways to phrase the question.

Anyway, the results are below and I find them strange.

Here's why I find them strange:

  • Many EAs believe enlightenment is real.
  • Many EAs are highly focused on reducing suffering.
  • Nobody is really talking
RedStateBlueState posted a Quick Take 15m ago

Trump recently said in an interview that he would seek to disband the White House office for pandemic preparedness. Given that he usually doesn't give specifics on his policy positions, this seems like something he is particularly interested in.

I know politics is discouraged on the EA forum, but I thought I would post this to say: EA should really be preparing for a Trump presidency. He's up in the polls and IMO has a >50% chance of winning the election. Right now politicians seem relatively receptive to EA ideas; this may change under a Trump administration.

Caruso posted a Quick Take 23m ago

This is an interesting #OpenPhil grant. $230K for a cyber threat intelligence researcher to create a database that tracks instances of users attempting to misuse large language models.

 Will user data be shared with the user's permission? How will an LLM determine the intent of the user when it comes to differentiating between purposeful harmful entries versus user error, safety testing, independent red-teaming, playful entries, etc. If a user is placed on the database, is she notified? How long do you stay in LLM prison? 

I did send an email to OpenPhil asking about this grant, but so far I haven't heard anything back.


I'm launching AI Lab Watch. I collected actions for frontier AI labs to improve AI safety, then evaluated some frontier labs accordingly.

It's a collection of information on what labs should do and what labs are doing. It also has some adjacent resources, including a list...


Thanks for doing this! This is one of those ideas that I've heard discussed for a while but nobody was willing to go through the pain of actually making the site; kudos for doing so.

Chris Leong
This is a great project idea!

I’ve been working in animal advocacy for two years and have an amateur interest in AI. I’m writing this in a personal capacity, and am not representing the views of my employer. 

Many thanks to everyone who provided feedback and ideas. 


In previous posts...


This is a very helpful post - thank you! I just wanted to make sure you've seen that the Bezos Earth Fund's $100 million AI grand challenge includes alternative proteins as one of three focus areas.

See here for details: 

GPT-5 training is probably starting around now. It seems very unlikely that GPT-5 will cause the end of the world. But it’s hard to be sure. I would guess that GPT-5 is more likely to kill me than an asteroid, a supervolcano, a plane crash or a brain tumor. We can predict...

yanni kyriacos
Hi Matthew! I'd be curious to hear your thoughts on a couple of questions (happy for you to link if you've posted elsewhere):  1/ What is the risk level above which you'd be OK with pausing AI? 2/ Under what conditions would you be happy to attend a protest? (LMK if you have already attended one!)

What is the risk level above which you'd be OK with pausing AI?

My loose off-the-cuff response to this question is that I'd be OK with pausing if there was a greater than 1/3 chance of doom from AI, with the caveats that:

  • I don't think p(doom) is necessarily the relevant quantity. What matters is the relative benefit of pausing vs. unpausing, rather than the absolute level of risk.
  • "doom" lumps together a bunch of different types of risks, some of which I'm much more OK with compared to others. For example, if humans become a gradually weaker force in the wor
...
yanni kyriacos
I'd like to make clear to anyone reading that you can support the PauseAI movement right now, only because you think it is useful right now. And then in the future, when conditions change, you can choose to stop supporting the PauseAI movement.  AI is changing extremely fast (e.g. technical work was probably our best bet a year ago, I'm less sure now). Supporting a particular tactic/intervention does not commit you to an ideology or team forever!

Just read this in the Guardian. 

The title is: "‘Eugenics on steroids’: the toxic and contested legacy of Oxford’s Future of Humanity Institute"

The sub-headline states: "Nick Bostrom’s centre for studying existential risk warned about AI but also gave rise to cultish...


Quick poll [✅ / ❌]: Do you feel like you don't have a good grasp of Shapley values, despite wanting to? 

(Context for after voting: I'm trying to figure out if more explainers of this would be helpful. I still feel confused about some of its implications, despite having spent significant time trying to understand it)
