Owen Cotton-Barratt (OCB)

Habryka identifies himself as the author of a different post which is linked to and being discussed in a different comment thread.

Yeah it totally has the same effect. It can just be less natural to analyse, if you think the risk will (or might) decrease a lot following some transition (which is also when the risk will mostly be incurred), but you're less confident about when the transition will occur.

I'm worried we're talking past each other here. We totally might find arrangements that keep the state risk at like 1% -- and in that case (as Thorstad points out) we shouldn't expect a very large future (though it could still be decently large compared to the world today).

But if your axiology is (in part) totalist, you'll care a lot whether we actually get to very large futures. I'm saying (agreeing with Thorstad) that these are dependent on finding some arrangement which drives risk very low. Then I'm saying (disagreeing with Thorstad?) that the decision-relevant question is more like "have we got any chance of getting to such a state?" rather than "are we likely to reach such a state?"

Ok, I agree with you that state risk is also an important part of the picture. I basically agree that nuclear risk is better understood as a state risk. I think the majority of AI risk is better understood as a transition risk, which was why I was emphasising that.

I guess at a very high level, I think: either there are accessible arrangements for society at some level of technological advancement which drive risk very low, or there aren't. If there aren't, it's very unlikely that the future will be very large. If there are, then there's a question of whether the world can reach such a state before an existential catastrophe. If risk now is lower than risk we're likely to incur on the path to such an arrangement, it can be thought of as a transition risk (whether we manage to bear the increased exposure on the way) ... by analogy, maybe there's a part of putting up the sail where you're exposed to being washed overboard by a freak wave, which can be thought of as a state risk which forms part of a transition risk.

If there are accessible arrangements, even if we can't identify them now, I expect some significant effort to go into searching and steering for them, so a nontrivial chance of reaching one. An argument that we won't reach such a state seems like it's either going to need to argue that there are no such states (seems unlikely to me; I think my intuition is informed in part by the existence of error correcting codes), or that it's vanishingly unlikely that we could reach one that does exist (doesn't seem impossible to me but I find it hard to see how we could hope to get confidence on this point).

(With apologies, I think this comment is kind of dense. Some better version of it would give the arguments more cleanly.)

Let me be clear about the type signature of the sail metaphor: it's not giving an object-level argument that the risk will drop a long way. I think it's a completely legit question why this one is different. (I'm not confident that it is, but the kinds of reasons I think it may well be are outlined in this post.)

Instead it's saying that it may be more natural to have the object-level conversations about transitions rather than about risk-per-century. Here's a stylized example:

  • Suppose you're confident that putting up the sail will incur a 50% risk, and otherwise risk is essentially zero
  • Suppose further that you don't know at all when the sail attempt will be made
    • (yeah, I'm mixing my metaphors here by keeping us on the boat for many centuries)
    • You decide to use Laplace's law of succession on centuries, starting 1 century ago
    • So ex ante there was a 1/2 chance of it happening in the last century; but that now hasn't happened, so there's a 1/3 chance of it happening in the next century. If we wait N more centuries without it happening, then the probability of it happening over the following century (i.e. conditional on it not having happened yet) is 1/(3+N)
  • Then your risk of falling in is 16% over the next century, and decreasing smoothly with time, but still 0.01% absolute risk (i.e. that's not even conditional on surviving that long) 100 centuries out (the quick calculation sketched after this list reproduces these numbers)
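
For concreteness, here's a minimal sketch that reproduces the stylized numbers above. The only inputs are the assumptions already stated (a 50% chance of falling in conditional on the sail attempt, and Laplace's law of succession on centuries starting 1 century ago); the variable names are just for illustration.

```python
# Minimal sketch reproducing the stylized numbers above, under the stated
# assumptions: 50% risk conditional on the sail attempt, and Laplace's law
# of succession (starting 1 century ago) for when the attempt is made.

RISK_GIVEN_ATTEMPT = 0.5

def p_attempt_next_century(centuries_observed_without_attempt):
    # Laplace's law of succession: after n centuries observed without an
    # attempt, P(attempt in the next century) = 1 / (n + 2).
    return 1 / (centuries_observed_without_attempt + 2)

# Risk of falling in over the next century: (1/3) * 50% ~= 16.7%.
print(p_attempt_next_century(1) * RISK_GIVEN_ATTEMPT)

# Absolute (unconditional) risk of falling in during the century that is
# 100 centuries out: P(no attempt before then) * P(attempt then) * 50%.
p_no_attempt_yet = 1.0
n = 1  # one century already observed without an attempt
for _ in range(100):
    p_no_attempt_yet *= 1 - p_attempt_next_century(n)
    n += 1
print(p_no_attempt_yet * p_attempt_next_century(n) * RISK_GIVEN_ATTEMPT)  # ~0.0001, i.e. ~0.01%
```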

In this example you're certain there's a time-of-perils dynamic going on, and that you have a 50% chance of an indefinitely long future without falling in. But it's hard to argue for any particular century by which risk is very low ... even the estimates in my spreadsheet don't provide bounds on risk, because you weren't at all confident in the per-century estimates of when the sail attempt would be made. 

My claim is that in cases roughly like this it can be more illuminating to think and argue about the risk-per-transition than the risk-per-century. (Of course, if you think that most risk is state risk rather than transition risk that's also worth discussion.)

The main point of my comment above is that "highly uncertain" is enough to support action premised on the possibility of a time of perils.

For what it's worth I think that the ontology of "dropping risk by many orders of magnitude" is putting somewhat too much emphasis on "risk per century" as a natural unit. I think a lot of anthropogenic risk is best understood not as a state risk (think "risk I randomly fall off the side of the boat"), but as a transition risk (think "risk I fall in as I try to put the sail up"). Some of the high risk imagined this century is from the possibility that we rush putting the sail up. We may not rush it! So my ex ante risk doesn't diminish super steeply over centuries as I don't know in which one the sail attempt will be made. But (in this metaphor) we only need to put the sail up once, and it would seem confused to argue that risk will stay high ~forever just because we don't know when we'll make the attempt.

From my perspective, therefore, the value of this work is that it establishes that it would be importantly decision-relevant to find strong arguments that we're not in a time of perils situation. That's not hugely surprising, but it's good to get the increased confidence and to have a handle on precisely how it would be decision-relevant.

This work hinges on the assumption that we're not in a time of perils situation. In other work Thorstad argues that the common arguments for thinking we're in a time of perils are uncompelling. I'm not sure I agree (i.e. on balance my inside view supports a time of perils, but I'm not sure that the case for this has ever been spelled out in a watertight way), but fair enough -- it's very healthy and good to poke at foundational assumptions. But he doesn't provide any strong arguments that we aren't in a time of perils. And the arguments presented here rely in important ways on certainty (rather than just likelihood) in the assumption that we're not in a time of perils -- the far-distant future should be discounted at its lowest possible rate, and that applies just as much to discounting for hazard rate (chance that we will go extinct) as to any other kind of discounting.
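
To illustrate the "lowest possible rate" point with a toy calculation of my own (the hazard rates and the 50/50 credence below are made-up numbers, not anything from the paper): even modest credence that risk settles very low means the expected long-run survival probability is dominated by the low-hazard branch, so the effective per-century hazard rate falls toward the lowest rate you assign any credence to.

```python
# Toy illustration (made-up numbers): with uncertainty over the per-century
# hazard rate, expected long-run survival is dominated by the lowest-hazard
# scenario, so the effective discount-for-hazard rate approaches that rate.

HIGH_HAZARD = 0.20   # 20% extinction risk per century (no time of perils)
LOW_HAZARD = 0.0001  # 0.01% per century (risk driven very low after a transition)
P_LOW = 0.5          # credence that we are in the low-hazard world

for centuries in (1, 10, 100, 1000):
    expected_survival = (P_LOW * (1 - LOW_HAZARD) ** centuries
                         + (1 - P_LOW) * (1 - HIGH_HAZARD) ** centuries)
    # Per-century hazard rate implied by the expected survival probability.
    effective_hazard = 1 - expected_survival ** (1 / centuries)
    print(centuries, round(effective_hazard, 4))  # declines from ~10% toward ~0.08%
```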

I think you're right to be more uncomfortable with the counterfactual analysis in cases where you're aligned with the other players in the game. Cribbing from a comment I've made on this topic before on the forum:

I think that counterfactual analysis is the right approach to take on the first point if/when you have full information about what's going on. But in practice you essentially never have proper information on what everyone else's counterfactuals would look like under the different actions you could take.

If everyone thinks in terms of something like "approximate shares of moral credit", then this can help in coordinating to avoid situations where a lot of people work on a project because it seems worth it on marginal impact, but it would have been better if they'd all done something different. Doing this properly might mean impact markets (where the "market" part works as a mechanism for distributing cognition, so that each market participant is responsible for thinking through their own alternative options, and feeding that information into the system via their willingness to do work for different amounts of pay), but I think that you can get some rough approximation to the benefits of impact markets without actual markets by having people do the things they would have done with markets -- and in this context, that means paying attention to the share of credit different parties would get.

Shapley values are one way to divide up that credit. They have some theoretical appeal, but it's basically as an answer to "what would a fair division of credit be, which divides the surplus compared to outside options?". And they're extremely complex to calculate, so in practice I'd recommend against even trying. Instead just think of it as an approximate bargaining solution between the parties, and use some other approximation to bargaining solutions -- I think Austin's practice of looking to the business world for guidance is a reasonable approach here.
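
For what it's worth, here's a minimal sketch of what the Shapley calculation involves in a toy two-party case (the party names and value function are made up for illustration). The averaging over every ordering of the parties is also what makes the cost blow up factorially as the number of parties grows, which is part of why I wouldn't try it in practice.

```python
from itertools import permutations

def shapley_values(players, value):
    # Shapley value: each player's marginal contribution to the coalition's
    # value, averaged over every ordering of the players (hence the
    # factorial blow-up as the number of players grows).
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for ordering in orderings:
        coalition = set()
        for p in ordering:
            before = value(coalition)
            coalition = coalition | {p}
            totals[p] += value(coalition) - before
    return {p: totals[p] / len(orderings) for p in players}

# Toy example: a funder and an org that produce value 10 only together.
def v(coalition):
    return 10.0 if {"funder", "org"} <= coalition else 0.0

print(shapley_values(["funder", "org"], v))  # {'funder': 5.0, 'org': 5.0}
```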

(If there's nobody whom you're plausibly coordinating with then I think trying to do a rough counterfactual analysis is reasonable, but that doesn't feel true of any of your examples.)

I'll give general takes in another comment, but I just wanted to call out that, at least for some of your examples, I think the assumptions are unrealistic (and this can make the puzzle sound worse than it is).

Take the case of "The funding of an organization and the people working at the org". In this case the two factors combine in a sub-multiplicative way rather than a multiplicative way: it's clear that if you double the funding and double the people working at the org you should approximately double the output (rather than quadruple it). I think that Cobb-Douglas production functions are often a useful modelling tool here.
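
As a minimal sketch of that point (the coefficient and exponent below are illustrative assumptions, not estimates of any real org): a Cobb-Douglas function with exponents summing to 1 has constant returns to scale, so doubling both funding and people doubles output, whereas a purely multiplicative model would quadruple it.

```python
# Minimal sketch of the Cobb-Douglas point; the coefficient and exponent
# are illustrative assumptions, not estimates of any real org.

A, ALPHA = 1.0, 0.5

def cobb_douglas(funding, people):
    # Constant returns to scale when the exponents sum to 1.
    return A * funding ** ALPHA * people ** (1 - ALPHA)

def multiplicative(funding, people):
    return A * funding * people

print(cobb_douglas(2.0, 2.0) / cobb_douglas(1.0, 1.0))      # 2.0 -- doubling both inputs doubles output
print(multiplicative(2.0, 2.0) / multiplicative(1.0, 1.0))  # 4.0 -- the multiplicative model quadruples it
```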

In the case of managers or the Forum I suspect that it's also not quite multiplicative -- but a bit closer to it. In any case I do think that after accounting for this there's still a puzzle about how to evaluate it.
