1

brianwang712 comments on Informational hazards and the cost-effectiveness of open discussion of catastrophic risks - Effective Altruism Forum

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (21)

You are viewing a single comment's thread. Show more comments above.

Comment author: brianwang712 24 June 2018 02:18:10AM 3 points [-]

The unilateralists curse only applies if you expect other people to have the same information as you right?

My understanding is that it applies regardless of whether or not you expect others to have the same information. All it requires is a number of actors making independent decisions, with randomly distributed error, with a unilaterally made decision having potentially negative consequences for all.

You can figure out if they have the same information as you to see if they are concerned about the same things you are. By looking at the mitigation's people are attempting. Altruists should be attempting mitigations in a unilateralist's curse position, because they should expect someone less cautious than them to unleash the information. Or they want to unleash the information themselves and are mitigating the downsides until they think it is safe.

I agree that having dangerous information released by those who are in a position to mitigate the risks is better than having a careless actor releasing that same information –– but I disagree that this is sufficient reason to preemptively release dangerous information. I think a world where everyone follows the logic of "other people are going to release this information anyway but less carefully, so I might as well release it first" is suboptimal compared to a world where everyone follows a norm of reaching consensus before releasing potentially dangerous information. And there are reasons to believe that this latter world isn't a pipe dream; after all, generally when we're thinking about info hazards, those who have access to the potentially dangerous information generally aren't malicious actors, but rather a finite number of, e.g., biology researchers (for biorisks) who could be receptive to establishing norms of consensus.

I'm also not sure how the strategy of "preemptively release, but mitigate" would work in practice. Does this mean release potentially dangerous information, but with the most dangerous parts redacted? Release with lots of safety caveats inserted? How does this preclude the further release of the unmitigated info?

I've not had the best luck reaching out to talk to people about my ideas. I expect that the majority of new ideas will come from people not heavily inside the group and thus less influenced by group think. So you might want to think of solutions that take that into consideration.

I'm not sure I'm fully understanding you here. If you're saying that the majority of potentially dangerous ideas will originate in those who don't know what the unilateralist's curse is, then I agree –– but I think this is just all the more reason to try to spread norms of consensus.

Comment author: WillPearson 24 June 2018 02:13:53PM 2 points [-]

My understanding is that it applies regardless of whether or not you expect others to have the same information. All it requires is a number of actors making independent decisions, with randomly distributed error, with a unilaterally made decision having potentially negative consequences for all.

Information determines the decisions that can be made. For example you can't spread the knowledge of how to create effective nuclear fusion without the information on how to make it.

If there is a single person with the knowledge of how to create safe efficient nuclear fusion they cannot expect other people to release it on their behalf. They may expect it to be net positive but they also expect some downsides and are unsure of whether it will be net good or not. To give a potential downside of nuclear fusion, let us say they are worried about creating excess heat over what the earth can dissipate due to widescale deployment in the world (even if it fixes global warming due to trapping solar energy, it might cause another heat related problem). I forget the technical term for this unfortunately.

The fusion expert(s) cannot expect other people to release this information for them, for as far as they know they are the only people making that exact decision.

I'm also not sure how the strategy of "preemptively release, but mitigate" would work in practice. Does this mean release potentially dangerous information, but with the most dangerous parts redacted? Release with lots of safety caveats inserted? How does this preclude the further release of the unmitigated info?

What the researcher can do is try and build consensus/lobby for a collective decision making body on the internal climate heating (ICH) problem. Planning to release the information when they are satisfied that there is going to be a solution in time for fixing the problem when it occurs.

If they find a greater than expected number of people lobbying for solutions to the ICH problem, then they can expect they are in a unilateralist's curse scenario. And they may want to hold off on releasing information even when they are satisfied with the way things are going (in case there is some other issue they have not thought of).

They can look to see what the other people are doing that have been helping with ICH and see if there other initiatives they are starting, that may or may not be to do with the advent of nuclear fusion.

I think I am also objecting to the expected payoff being thought of as a fixed quantity. You can either learn more about the world to alter your knowledge of the payoff or try and introduce things/insituttions into the world to alter the expected payoff. Building useful institutions may rely on releasing some knowledge, that is where things become more hairy.

I've not had the best luck reaching out to talk to people about my ideas. I expect that the majority of new ideas will come from people not heavily inside the group and thus less influenced by group think. So you might want to think of solutions that take that into consideration.

I'm not sure I'm fully understanding you here. If you're saying that the majority of potentially dangerous ideas will originate in those who don't know what the unilateralist's curse is, then I agree –– but I think this is just all the more reason to try to –– but I think this is just all the more reason to try to spread norms of consensus.

I was suggesting that more norm spreading should be done outwards, keeping it simple and avoiding too much jargon. Is there a presentation of the unilateralist's curse aimed at micro biologists for example?

Also as the the unilaterlist's curse suggests discussing with other people such that they can undertake the information release, sometimes increases the expectation of a bad out come. How should consensus be reached in those situations?

Increasing the number of agents capable of undertaking the initiative also exacerbates the problem: as N grows, the likelihood of someone proceeding incorrectly increases monotonically towards 1.7 The magnitude of this effect can be quite large even for relatively small number of agents. For example, with the same error assumptions as above, if the true value of the initiative V* = -1 (the initiative is undesirable), then the probability of erroneously undertaking the initiative grows rapidly with N, passing 50% for just 4 agents.

Comment author: brianwang712 25 June 2018 03:38:49AM 0 points [-]

If there is a single person with the knowledge of how to create safe efficient nuclear fusion they cannot expect other people to release it on their behalf.

Ah right. I suppose the unilateralist's curse is only a problem insofar as there are a number of other actors also capable of releasing the information; if you are a single actor then the curse doesn't really apply. Although one wrinkle might be considering the unilateralist's curse with regards to different actors through time (i.e., erring on the side of caution with the expectation that other actors in the future will gain access to and might release the information), but coordination in this case might be more challenging.

What the researcher can do is try and build consensus/lobby for a collective decision making body on the internal climate heating (ICH) problem. Planning to release the information when they are satisfied that there is going to be a solution in time for fixing the problem when it occurs.

Thanks, this concrete example definitely helps.

I think I am also objecting to the expected payoff being thought of as a fixed quantity. You can either learn more about the world to alter your knowledge of the payoff or try and introduce things/insituttions into the world to alter the expected payoff. Building useful institutions may rely on releasing some knowledge, that is where things become more hairy.

This makes sense. "Release because the expected benefit is above the expected risk" or "not release because the vice versa is true" is a bit of a false dichotomy, and you're right that we should be more thinking about options that could maximize the benefit while minimizing the risk when faced with info hazards.

Also as the the unilaterlist's curse suggests discussing with other people such that they can undertake the information release, sometimes increases the expectation of a bad out come. How should consensus be reached in those situations?

This can certainly be a problem, and is a reason not to go too public when discussing it. Probably it's best to discuss privately with a number of other trusted individuals first, who also understand the unilateralist's curse, and ideally who don't have the means/authority of releasing the information themselves (e.g., if you have a written up blog post you're thinking of posting that might contain info hazards, then maybe you could discuss in vague terms with other individuals first, without sharing the entire post with them?).

Comment author: WillPearson 25 June 2018 07:12:53PM *  2 points [-]

Ah right. I suppose the unilateralist's curse is only a problem insofar as there are a number of other actors also capable of releasing the information; if you are a single actor then the curse doesn't really apply. Although one wrinkle might be considering the unilateralist's curse with regards to different actors through time (i.e., erring on the side of caution with the expectation that other actors in the future will gain access to and might release the information), but coordination in this case might be more challenging.

Interesting idea. This may be worth trying to develop more fully?

Probably it's best to discuss privately with a number of other trusted individuals first, who also understand the unilateralist's curse,

I'm still coming at this from a lens of "actionable advice for people not in ea". It might be that the person doesn't know many other trusted individuals, what should be the advice then? It would probably also be worth giving advice on how to have the conversation as well. The original article gives some advice on what happens if consensus can't be reached (voting/such like).

As I understand it you shouldn't wait for consensus else you have the unilateralist's curse in reverse. Someone pessimistic about an intervention can block the deployment of an intervention needed to avoid disaster (this seems very possible if you consider crucial considerations flipping signs, rather than just random noise in beliefs in desirability).

Would you suggest discussion and vote (assuming no other courses of action can be agreed upon)? Do you see the need to correct for status quo bias in any way?

This seems very important to get right. I'll think about this some more.

Comment author: brianwang712 27 June 2018 01:39:56PM 1 point [-]

Interesting idea. This may be worth trying to develop more fully?

Yeah. I'll have to think about it more.

I'm still coming at this from a lens of "actionable advice for people not in ea". It might be that the person doesn't know many other trusted individuals, what should be the advice then?

Yeah, for people outside EA I think structures could be set up such that reaching consensus (or at least a majority vote) becomes a standard policy or an established norm. E.g., if a journal is considering a manuscript with potential info hazards, then perhaps it should be standard policy for this manuscript to be referred to some sort of special group consisting of journal editors from a number of different journals to deliberate. I don't think people need to be taught the mathematical modeling behind the unilateralist's curse for these kinds of policies to be set up, as I think people have an intuitive notion of "it only takes one person/group with bad judgment to fuck up the world; decisions this important really need to be discussed in a larger group."

One important distinction is that people who are facing info hazards will be in very different situations when they are within EA vs. when they are out of EA. For people within EA, I think it is much more likely to be the case that a random individual has an idea that they'd like to share in a blog post or something, which may have info hazard-y content. In these situations the advice "talk to a few trusted individuals first" seems to be appropriate.

For people outside of EA, I think those who are in possession of info hazard-y content are much more likely to be embedded in some sort of larger institution (e.g., a research scientist or a journal editor looking to publish something), where perhaps the best leverage is setting up certain policies, rather than trying to teach everyone the unilateralist's curse.

As I understand it you shouldn't wait for consensus else you have the unilateralist's curse in reverse. Someone pessimistic about an intervention can block the deployment of an intervention needed to avoid disaster.

You're right, strict consensus is the wrong prescription. A vote is probably better. I wonder if there's mathematical modeling that you could do that would determine what fraction of votes is optimal, in order to minimize the harms of the standard unilateralist's curse and the curse in reverse? Is it a majority vote? A 2/3s vote? l suspect this will depend on what the "true sign" of releasing the potentially dangerous info is likely to be; the more likely it is to be negative, the higher bar you should be expected to clear before releasing.

Comment author: WillPearson 27 June 2018 07:50:21PM *  0 points [-]

For people outside of EA, I think those who are in possession of info hazard-y content are much more likely to be embedded in some sort of larger institution (e.g., a research scientist or a journal editor looking to publish something), where perhaps the best leverage is setting up certain policies, rather than trying to teach everyone the unilateralist's curse.

There is a growing movement of maker's and citizen scientists that are working on new technologies. It might be worth targeting them somewhat (although again probably without the math). I think the approaches for ea/non-ea seem sensible.

You're right, strict consensus is the wrong prescription. A vote is probably better. I wonder if there's mathematical modeling that you could do that would determine what fraction of votes is optimal, in order to minimize the harms of the standard unilateralist's curse and the curse in reverse? Is it a majority vote? A 2/3s vote? l suspect this will depend on what the "true sign" of releasing the potentially dangerous info is likely to be; the more likely it is to be negative, the higher bar you should be expected to clear before releasing.

I also like to weigh the downside of the lack of releasing the information as well. If you don't release information you are making everyone make marginally worse decisions (if you think someone will release it anyway later). For example in the nuclear fusion example, you think that everyone currently building new nuclear fission stations are wasting their time, that people training on how to manage coal plants should be training on something else etc, etc.

I also have another consideration which is possibly more controversial. I think we need some bias to action, because it seems like we can't go on as we are for too much longer (another 1000 years might be pushing it). The level of resources and coordination towards global problems fielded by the status quo seems insufficient. So it is a default bad outcome.

With this consideration, going back to the fusion pioneers, they might try and find people to tell so that they could increase the bus factor (the number of people that would have to die to lose the knowledge). They wouldn't want the knowledge to get lost (as it would be needed in the long term) and they would want to make sure that whoever they told understood the import and potential downsides of the technology.

Edit: Knowing the sign of an intervention is hard, even after the fact. Consider the invention and spread of the knowledge about nuclear chain reactions. Without it we would probably be burning a lot more fossil fuels, however with it we have the existential risk associated with it. If that risk never pays out, then it may have been a spur towards greater coordination and peace.

I'll try and formalise these thoughts at some point, but I am bit work impaired for a while.

Comment author: turchin 28 June 2018 10:08:13AM -1 points [-]

One more problem with the idea that I should consult my friends first before publishing a text is a "friend' bias": people who are my friends tend to react more positively on the same text than those who are not friends. I personally had a situation when my friends told me that my text is good and non-info-hazardous, but when I presented it to people who didn't know me, their reaction was opposite.

Comment author: turchin 26 June 2018 09:30:03AM 0 points [-]

Sometimes, when I work on a complex problem, I feel as if I become one of the best specialists in it. Surely, I know three other people who are able to understand my logic, but one of them is dead, another is not replying on my emails and the third one has his own vision, affected by some obvious flaw. So none of them could give me correct advice about the informational hazard.