TL;DR: To prevent x-risks, our strategic vision should outperform the technical capabilities of potential malevolent agents. This means that strategic discussion should be public and open, but publication of dangerous technical knowledge should be prevented.
Risks and benefits of open discussion
Bostrom has created a typology of info-hazards, but any information could also have an “x-risk prevention positive impact”, or info-benefit. Obviously, the info-benefits must outweigh the info-hazards of open public discussion of x-risks, or x-risk research is useless. In other words, the “cost-effectiveness” of the open discussion of a risk A should be estimated: the potential increase in catastrophic probability should be weighed against the possible decrease in the probability of a catastrophe.
The benefits of public discussion are rather obvious: If we publicly discuss a catastrophic risk, we can raise awareness about it, prepare for the risk, and increase funding for its prevention. Publicly discussed risks are also more likely to be viewed by a larger group of scientists than those that are discussed by some closed group. Interdisciplinary research and comparison of different risks is impossible if they are secrets of different groups; as a result of this phenomenon, for example, asteroid risks are overestimated (more publicly discussed) and biorisks are underestimated (less-discussed).
A blanket "information hazard counterargument" is too general, as any significant new research on x-risks changes the information landscape. For example, even good news about new prevention methods may be used by bad actors to overcome these methods.
The problem of informational hazards has already been explored in the field of computer security, which has developed best practices in this area. The protocol calls for the discoverer of a vulnerability to first try to contact the firm which owns the vulnerable software, and later, if there is no reply, to publish it openly (or at least hint at it), to provide users with an advantage over bad actors who may exploit it secretly.
The relative power of info-hazards depends on the information that has already been published and on other circumstances
Consideration 1. If something is already public knowledge, then discussing it is not an informational hazard. Example: AI risk. The same is true for the “attention hazard”: if something is already extensively present in the public field, it is less dangerous to discuss it publicly.
Consideration 2: If information X is public knowledge, then similar information X2 is a lower informational hazard. Example: if the genome of a flu virus N1 has been published, publishing the similar flu genome N2 carries only a marginal informational hazard.
Consideration 3. If many info-hazards have already been openly published, the world may be considered saturated with info-hazards, as a malevolent agent already has access to so much dangerous information. In our world, where genomes of the pandemic flus have been openly published, it is difficult to make the situation worse.
Consideration 4: If I have an idea about x-risks in a field in which I don’t have technical expertise and it only took me one day to develop this idea, such an idea is probably obvious to those with technical expertise, and most likely regarded by them as trivial or non-threatening.
Consideration 5: A layman with access only to information available on Wikipedia is not able to generate ideas about really powerful informational hazards, that could not already be created by a dedicated malevolent agent, like the secret service of a rogue country. However, if one has access to unique information, which is typically not available to laymen, this could be an informational hazard.
Consideration 6. Some ideas will be suggested anyway soon, but by speakers who are less interested in risk prevention. For example, ideas similar to Roko’s Basilisk have been suggested independently twice by my friends.
Consideration 7: Suppressing some kinds of information may signal its importance to malevolent agents, producing a “Streisand effect”.
Consideration 8: If there is a secret net to discuss such risks, some people will be excluded from it, which may create undesired social dynamics.
Info-hazard classification
There are several types of catastrophic informational hazard (a more detailed classification can be seen in Bostrom’s article, which covers not only catastrophic info-hazards, but all possible ones):
* Technical dangerous information (e.g. the genome of a virus);
* Ideas about possible future risks (e.g. deflection of asteroids towards Earth);
* Value-related informational hazards (e.g. the idea of a voluntary human extinction movement, or of fighting overpopulation by creating small catastrophes);
* Attention-related info-hazards. These are closely related to value hazards, as the more attention an idea gets, the more value humans typically assign to it. A potentially dangerous idea should be discussed in ways that are more likely to attract the attention of specialists than of the public or of potentially dangerous agents. This includes special forums, jargon, non-sensational titles, and scientific fora for discussion.
The most dangerous are value-related and technical information. Value-related information could work as a self-replicating meme, and technical information could be used to actually create dangerous weapons, while ideas about possible risks could help us to prepare for such risks or start additional research.
Value of public discussion
We could use the change in human extinction probability as the only important measure of the effectiveness of any action, according to Bostrom’s maxipok rule. In that case, the utility of any public statement A is:
V = ∆I (increase of survival probability via better preparedness) − ∆IH (increase of the probability of the x-risk because bad actors will learn of it).
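As a toy illustration of this trade-off, the decision rule can be sketched as follows (all probability numbers here are hypothetical, chosen only to show the two regimes, not estimates from the text):

```python
# Toy sketch of the trade-off V = dI - dIH described above.
# All probability numbers are hypothetical illustrations.

def publication_value(delta_i: float, delta_ih: float) -> float:
    """Net change in survival probability from publishing statement A.

    delta_i:  increase of survival probability via better preparedness.
    delta_ih: increase of catastrophe probability because bad actors learn it.
    """
    return delta_i - delta_ih

# A widely known risk: discussion mostly adds preparedness, leaks little.
v_known_risk = publication_value(delta_i=0.010, delta_ih=0.001)    # positive: publish

# A novel technical recipe: little preparedness gain, large leak.
v_novel_recipe = publication_value(delta_i=0.001, delta_ih=0.010)  # negative: withhold

print(v_known_risk, v_novel_recipe)
```

The point of the sketch is only that the sign of V, not the absolute size of either term, determines whether open discussion is net positive.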
Emerging technologies increase the complexity of the future, which at some point could become chaotic. The more chaotic the future, the shorter the planning horizon, and the less time we have to act preventively. We need a full picture of future risks for strategic planning. To have this full picture, we must openly discuss the risks without going into technical details.
The reason is that we cannot prevent risks we do not know about, and a prevention strategy needs the full list of risks, while a malevolent agent may need only the technical knowledge of one risk (and such knowledge is already available in the field of biotech, so malevolent agents cannot gain much from our lists).
Conclusion
Society could benefit from the open discussion of ideas about possible risks, as such discussion could help in the development of general prevention measures, increasing awareness, funding and cooperation. It could also help us to choose priorities in fighting different global risks.
For example, biorisks are less-discussed and thus could be perceived as less of a threat than the risks of AI. However, biorisks could exterminate humanity before the emergence of superintelligent AI (to prove this argument I would have to present general information which may be regarded as an informational hazard). Yet the amount of technically hazardous information openly published is much larger in the field of biorisks – exactly because the risk of the field as a whole is underestimated!
If you have a new idea which may be a potential info-hazard, first search the internet to find out whether it has already been published – most likely, it has. Then you may privately discuss it with a respected scientist in the field who also has knowledge of catastrophic risks, and ask whether the scientist thinks the idea is really dangerous. The attention hazard should be handled by non-sensationalist, non-media-attracting methods of analysis.
It is a best practice to add to the description of any info-hazard the ways in which the risk could be overcome, or how discussion of it could be used to find approaches to its mitigation.
Literature:
Bostrom, N. “Information Hazards: A Typology of Potential Harms from Knowledge”, 2011. https://nickbostrom.com/information-hazards.pdf
Yampolskiy, R. “Beyond MAD?: The Race for Artificial General Intelligence”. https://www.itu.int/en/journal/001/Documents/itu2018-9.pdf
LessWrong wiki, “Information hazard”. https://wiki.lesswrong.com/wiki/Information_hazard
Information determines which decisions can be made. For example, you cannot spread the ability to create effective nuclear fusion without releasing the information on how to make it.
If there is a single person with the knowledge of how to create safe, efficient nuclear fusion, they cannot expect other people to release it on their behalf. They may expect the release to be net positive, but they also expect some downsides and are unsure whether it will be net good or not. To give a potential downside of nuclear fusion: suppose they are worried that wide-scale deployment would create excess heat over what the Earth can dissipate (even if fusion fixes the global warming caused by trapping solar energy, it might cause another heat-related problem). I forget the technical term for this, unfortunately.
The fusion expert(s) cannot expect other people to release this information for them, for as far as they know they are the only people making that exact decision.
What the researcher can do is try to build consensus and lobby for a collective decision-making body on the internal climate heating (ICH) problem, planning to release the information when they are satisfied that a solution will be ready in time to fix the problem when it occurs.
If they find a greater than expected number of people lobbying for solutions to the ICH problem, then they can expect they are in a unilateralist's curse scenario, and they may want to hold off on releasing the information even when they are satisfied with the way things are going (in case there is some other issue they have not thought of).
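The dynamic behind this worry can be shown with a toy Monte Carlo simulation (the true value, noise level, and actor counts below are hypothetical, purely for illustration): when several independent actors each hold a noisy estimate of the information's value and any one of them can release it, the single most optimistic estimate decides the outcome, so mistaken release becomes more likely as the number of actors grows.

```python
import random

# Toy Monte Carlo of the unilateralist's curse. The information's true value
# is slightly negative; each actor sees a noisy private estimate; release
# happens if ANY actor's estimate looks positive. All parameters hypothetical.

def release_probability(n_actors: int, true_value: float = -0.5,
                        noise: float = 1.0, trials: int = 20000,
                        seed: int = 0) -> float:
    rng = random.Random(seed)
    releases = 0
    for _ in range(trials):
        if any(true_value + rng.gauss(0, noise) > 0 for _ in range(n_actors)):
            releases += 1
    return releases / trials

# Mistaken release is much more likely with ten independent actors than one.
print(release_probability(1), release_probability(10))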
They can look to see what the other people are doing that have been helping with ICH and see if there other initiatives they are starting, that may or may not be to do with the advent of nuclear fusion.
I think I am also objecting to the expected payoff being thought of as a fixed quantity. You can either learn more about the world to alter your knowledge of the payoff or try and introduce things/insituttions into the world to alter the expected payoff. Building useful institutions may rely on releasing some knowledge, that is where things become more hairy.
I was suggesting that more norm spreading should be done outwards, keeping it simple and avoiding too much jargon. Is there a presentation of the unilateralist's curse aimed at micro biologists for example?
Also as the the unilaterlist's curse suggests discussing with other people such that they can undertake the information release, sometimes increases the expectation of a bad out come. How should consensus be reached in those situations?
Ah right. I suppose the unilateralist's curse is only a problem insofar as there are a number of other actors also capable of releasing the information; if you are a single actor then the curse doesn't really apply. Although one wrinkle might be considering the unilateralist's curse with regards to different actors through time (i.e., erring on the side of caution with the expectation that other acto... (read more)