
Ryan Greenblatt

Member of Technical Staff @ Redwood Research
548 karma

Bio

This other Ryan Greenblatt is my old account[1]. Here is my LW account.

  1. ^

    Account lost to the mists of time and expired university email addresses.

Comments (140) · Topic contributions (2)

small-c conservative views

Huh, I thought that most of the disagreement between people around these parts and bioethicists was in the direction of people around here being more in favor of freedoms for human subjects/patients. (Freedoms aren't exactly the same as protections, but I interpret small-c conservative as being more about freedoms.)

Examples:

  • Right to sell my organs
  • Right to select my kids on the basis of non-medical features
  • Right to access unapproved treatments
  • Right to die if I am of sound mind and wish to do so
  • Right to sign up for arbitrary medical trials/studies, including being compensated and including potentially dangerous medical trials/studies. (Subject to sound mind constraints and maybe extortion constraints.)

Generally, I personally think that much more freedom in medicine would be better.

(In fact, I think a total free-for-all would plausibly be better than the status quo, though I'm pretty uncertain.)

I agree that there is a disagreement around how utilitarian the medical system should be vs. how much it should follow some more fairness-based principle.

However, if you go fully in the direction of individual liberties, government involvement in the medical system doesn't matter much. E.g., in a simple system like:

  • Redistribute wealth as desired
  • People can buy whatever health care they want and sign up for whatever clinical trials they want with virtually no government regulation. (Clinical trials require genuinely informed consent.)

The state doesn't need to make any tradeoffs in health care as it isn't involved. Hospitals (for example) can do whatever they want with respect to prioritizing care, and they could in principle compete, etc.

(I'm not claiming that fully in the direction of individual liberties is the right move, e.g. it seems like people are often irrational about health care and hospitals often have monopolies which can cause issues.)

If your institute would like to contribute to this discussion, I would advise you to publish your work in a leading economics journal and to present your work at reputable economics departments and conferences.

I'm aware of various people considering trying to argue with economists about explosive growth (e.g. about the conclusions of this report).

In particular, the probability of explosive growth if you condition on human-level machine intelligence. More precisely, something like human-level machine intelligence and human-level robotic bodies, where the machine intelligence requires 10^14 FLOP per human-equivalent second (e.g. 1/10 of an H100), can run 5x faster than humans using current hardware, and the robotic bodies cost $20,000 (on today's manufacturing base).
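As a rough sanity check on how these numbers fit together (my own arithmetic, not from the report; the H100 throughput is an approximate figure):

```python
# Back-of-the-envelope check of the compute assumption above.
# The H100 throughput is my approximation (~1e15 dense FP16 FLOP/s),
# not a number taken from the report being discussed.
H100_FLOP_PER_SEC = 1e15          # approximate dense FP16 throughput of one H100
FLOP_PER_HUMAN_EQUIV_SEC = 1e14   # assumed cost of one human-equivalent second of thought

human_equiv_seconds_per_h100_second = H100_FLOP_PER_SEC / FLOP_PER_HUMAN_EQUIV_SEC
print(human_equiv_seconds_per_h100_second)
# ~10, i.e. one human equivalent uses ~1/10 of an H100, matching the parenthetical above.
# The separate "5x faster than humans" figure is an additional assumption about serial
# speed on current hardware, not something that follows from this ratio.
```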

From my understanding, they never ended up trying to do this.

Personally, I argued against this being a good use of time:

  • It seems unlikely to me that the economists would actually take these ideas seriously; their actual crux is more like "this is crazy, so I reject the premise".
  • It doesn't seem likely that the economists' perspective would be very enlightening for us (e.g. I don't expect they would have many useful contributions).
  • I don't think it seems that useful to persuade arbitrary economists from a credibility/influence perspective.

So, I think the main question here is a question of whether this is a good use of time.

I think it's probably better to start by trying to talk with economists rather than trying to write a paper.

It is deeply misleading to suggest that accelerating economic growth “has been the norm for most of human history”.

From my understanding of historical growth rate estimates this is wrong. (As in, it is not "deeply misleading".)

Most historical growth rates were far slower than economic growth today. I think you might mean that we have transitioned over time from slower to faster growth modes.

To me, this sounds very similar to "economic growth has accelerated over time". And it sounds like this has happened over a long total period of time.

Maybe you think growth has been very discrete, with phases (this seems unlikely to me, as the dominant driver is likely population growth and improved capacity for technological development, e.g. via reduced malnutrition). Or maybe you think it is key that the change in the rate of growth has historically been slow in sidereal time.

I think literal extinction is unlikely even conditional on misaligned AI takeover due to:

  • The potential for the AI to be at least a tiny bit "kind" (same as humans probably wouldn't kill all aliens).[1]
  • Decision theory/trade reasons

This is discussed in more detail here and here.

Insofar as humans and/or aliens care about nature, similar arguments apply there too, though this is mostly beside the point: if humans survive and have (even a tiny bit of) resources, they can easily preserve some nature.

I find it annoying how confident this article is without really bothering to engage with the relevant arguments here.

(Same goes for many other posts asserting that AIs will disassemble humans for their atoms.)

(This comment echoes Owen's to some extent.)

  1. ^

    This includes the potential for the AI to have preferences that are morally valuable from a typical human perspective.

Ultimately what matters most is what the leadership's views are.

I'm skeptical this is true particularly as AI companies grow massively and require vast amounts of investment.

It does seem important, but it's unclear that it matters most.

One key issue with this model is that I expect that the majority of x-risk, from my perspective, doesn't correspond to extinction and instead corresponds to some undesirable group ending up with control over the long-run future (either AIs seizing control (AI takeover) or undesirable human groups).

So, I would reject:

We can model extinction here by n(t) going to zero.

You might be able to recover things by supposing that, on an x-risk event, n(t) gets multiplied by some constant rather than going to zero?
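A minimal way to write that modification (my notation, not the original model's): pick a takeover/x-risk time $t_0$ and a constant $c$ capturing how much value survives (e.g. via trade or partially valuable AI preferences), and replace the extinction assumption $n(t) \to 0$ with

$$n(t) \mapsto c \, n(t) \quad \text{for } t \ge t_0, \qquad 0 \le c \le 1.$$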

(Further, even if AI takeover does result in extinction there will probably still be some value due to acausal trade and potentially some value due to the AI's preferences.)

(Regardless, I expect that if you think the singularity is plausible, the effects of discounting are more complex, because we could very plausibly have >10^20 experience-years per year within 5 years of the singularity due to e.g. building a Dyson sphere around the sun. If we just look at AI takeover, ignore (acausal) trade, and assume for simplicity that AI preferences have no value, then it is likely that the vast, vast majority of value is contingent on retaining human control. If we allow for acausal trade, then the discount rates of the AI will also be important for determining how much trade should happen.)
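To illustrate the scale here (the discount rate is my own toy choice; the other figures are from the paragraph above):

```python
# Toy illustration: even a nontrivial pure time discount applied over the ~5 years
# before a post-singularity civilization (>1e20 experience-years per year) leaves
# the post-singularity term overwhelmingly dominant.
annual_discount_rate = 0.05        # hypothetical pure time-discount rate (my choice)
years_until_singularity = 5        # figure from the paragraph above
experience_years_per_year = 1e20   # figure from the paragraph above

discount_factor = (1 - annual_discount_rate) ** years_until_singularity
print(discount_factor)                              # ~0.77
print(discount_factor * experience_years_per_year)  # ~7.7e19 discounted experience-years per year
```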

(Separately, pure temporal discounting seems pretty insane and incoherent with my view of how the universe works.)

I tried to find out if the time-horizons for potential x-risk events have been explicitly discussed in longtermism literature but I didn’t come across anything.

See here

More specifically, is there any good reason to assume that the odds are in favor of humans even by a little bit? If so, what exactly is the argument for that?

There is a good argument from your perspective: human resource utilization is likely to be more similar to your values on reflection than a randomly chosen other species.

Is there any specific reason for discounting the possibility of arthropods or reptiles evolving over millions of years into something that equals or surpasses the intelligence of the humans that were last alive?

No, I think the analysis shouldn't discount this. Unless there is an unknown, hard-to-pass point (a filter) between existing mammals/primates and human-level civilization, it seems like intelligent life re-evolving is quite likely. (I'd say an 85% chance of a new civilization conditional on human extinction but not primate extinction, and 75% if primates also go extinct.)

There is also the potential for alien civilizations, though I think this has a lower probability (perhaps 50% that aliens capture >75% of the cosmic resources in our light cone if earth-originating civilizations don't capture these resources).

IMO, the dominant effect of extinction due to bio-risk is that a different earth-originating species acquires power, and my values on reflection are likely to be closer to humanity's values on reflection than to the other species'. (I also have some influence over how humanity spends its resources, though I expect this effect is not that big.)

If you were equally happy with other species, then I think you still only take a ~10x discount from these considerations, because there is some possibility of a hard-to-pass barrier between other life and humans (the ~15-25% chance above that no successor civilization emerges). 10x discounts don't usually seem like cruxes IMO.

I would also note that for AI x-risk, intelligent life re-evolving is unimportant. (I also think AI x-risk is unlikely to result in extinction, because AIs are unlikely to want to kill all humans, for various reasons.)

And over time scales of billions, we could enter the possibility of evolution from basic eukaryotes too. 

Earth will be habitable for roughly 1 billion more years, which probably isn't quite enough time for this.

Perceived counter-argument:

My proposed counter-argument, loosely based on the structure of yours:

Summary of claims

  • A reasonable fraction of computational resources will be spent based on the result of careful reflection.
  • I expect to be reasonably aligned with the result of careful reflection by other humans.
  • I expect to be much less aligned with the result of AIs-that-seize-control reflecting, due to less similarity and the potential for AIs to pursue relatively specific objectives from training (things like reward seeking).
  • Many arguments that human resource usage won't be that good seem to apply equally well to AIs and thus aren't differential.

Full argument

The vast majority of value from my perspective on reflection (where my perspective on reflection is probably somewhat utilitarian, but this is somewhat unclear) in the future will come from agents who are trying to optimize explicitly for doing "good" things and are being at least somewhat thoughtful about it, rather than those who incidentally achieve utilitarian objectives. (By "good", I just mean what seems to them to be good.)

At present, the moral views of humanity are a hot mess. However, it seems likely to me that a reasonable fraction of the total computational resources of our lightcone (perhaps 50%) will in expectation be spent based on the result of a process in which an agent or some agents think carefully about what would be best, in a pretty deliberate and relatively wise way. This could involve eventually deferring to other smarter/wiser agents or massive amounts of self-enhancement. Let's call this a "reasonably-good-reflection" process.

Why think a reasonable fraction of resources will be spent like this?

  • If you self-enhance and get smarter, this sort of reflection on your values seems very natural. The same for deferring to other smarter entities. Further, entities in control might live for an extremely long time, so if they don't lock in something, as long as they eventually get around to being thoughtful it should be fine.
  • People who don't reflect like this probably won't care much about having vast amounts of resources and thus the resources will go to those who reflect.
  • The argument for "you should be at least somewhat thoughtful about how you spend vast amounts of resources" is pretty compelling at an absolute level and will be more compelling as people get smarter.
  • Currently a variety of moderately powerful groups are pretty sympathetic to this sort of view and the power of these groups will be higher in the singularity.

I expect that I am pretty aligned (on reasonably-good-reflection) with the result of random humans doing reasonably-good-reflection, as I am also a human and many of the underlying arguments/intuitions that seem important to me are likely to also seem important to many other humans (given various common human intuitions) upon those humans becoming wiser. Further, I really just care about the preferences of (post-)humans who end up caring most about using vast, vast amounts of computational resources (assuming I end up caring about these things on reflection), because the humans who care about other things won't use most of the resources. Additionally, I care "most" about the on-reflection preferences I have which are relatively less contingent and more common among at least humans, for a variety of reasons. (One way to put this is that I care less about worlds in which my preferences on reflection seem highly contingent.)

So, I've claimed that reasonably-good-reflection resource usage will be non-trivial (perhaps 50%) and that I'm pretty aligned with humans on reasonably-good-reflection. Supposing these, why think that most of the value is coming from something like reasonably-good-reflection preferences rather than other things, e.g. not-very-thoughtful indexical-preference (selfish) consumption? Broadly three reasons:

  • I expect huge returns to heavy optimization of resource usage (similar to spending altruistic resources today IMO, and in the future we'll be smarter, which will make this effect stronger).
  • I don't think that (even heavily optimized) not-very-thoughtful indexical preferences directly result in things I care that much about relative to things optimized for what I care about on reflection (e.g. it probably doesn't result in vast, vast, vast amounts of experience which is optimized heavily for goodness/$).
    • Consider how billionaires currently spend money, which doesn't seem to have much direct value, certainly not relative to their altruistic expenditures.
    • I find it hard to imagine that indexical self-ish consumption results in things like simulating 10^50 happy minds. See also my other comment. It seems more likely IMO that people with self-ish preferences mostly just buy positional goods that involve little to no experience (separately, I expect this means that people without self-ish preferences get more of the compute, but this is counted in my earlier argument, so we shouldn't double count it.)
  • I expect that indirect value "in the minds of the laborers producing the goods for consumption" is also small relative to things optimized for what I care about on reflection. (It seems pretty small or maybe net-negative (due to factory farming) today (relative to optimized altruism) and I expect the share will go down going forward.)

(Aside: I was talking about not-very-thoughtful indexical preferences. It seems likely to me that doing a reasonably good job of reflecting on selfish preferences gets you back to something like de facto utilitarianism (at least as far as how you spend the vast majority of computational resources), because personal identity and indexical preferences don't make much sense and the thing you end up thinking is more like "I guess I just care about experiences in general".)

What about AIs? I think there are broadly two main reasons to expect that what AIs do on reasonably-good-reflection will be worse from my perspective than what humans do:

  • As discussed above, I am more similar to other humans and when I inspect the object level of how other humans think or act, I feel reasonably optimistic about the results of reasonably-good-reflection for humans. (It seems to me like the main thing holding me back from agreement with other humans is mostly biases/communication/lack of smarts/wisdom given many shared intuitions.) However, AIs might be more different and thus result in less value. Further, the values of humans after reasonably-good-reflection seem close to saturating in goodness from my perspective (perhaps 1/3 or 1/2 of the value of purely my values), so it seems hard for AI to do better.
    • To better understand this argument, imagine that instead of humanity, the question was between identical clones of myself and AIs. It's pretty clear I share the same values as the clones, so the clones do pretty much strictly better than AIs (up to self-defeating moral views).
    • I'm uncertain about the degree of similarity between myself and other humans. But mostly, the underlying similarity uncertainties also apply to AIs. So, e.g., maybe I currently think that on reasonably-good-reflection humans spend resources 1/3 as well as I would and AIs spend resources 1/9 as well. If I updated to think that other humans after reasonably-good-reflection only spend resources 1/10 as well as I do, I might also update to thinking AIs spend resources 1/100 as well. (See the toy numbers in the sketch after this list.)
  • In many of the stories I imagine for AIs seizing control, very powerful AIs end up directly pursuing close correlates of what was reinforced in training (sometimes called reward-seeking, though I'm trying to point at a more general notion). Such AIs are reasonably likely to pursue relatively obviously valueless-from-my-perspective things on reflection. Overall, they might act more like an ultra-powerful corporation that just optimizes for power/money rather than like our children (see also here). More generally, AIs might in some sense be subjected to wildly higher levels of optimization pressure than humans while being better able to internalize these values (lack of a genetic bottleneck), which can plausibly result in "worse" values from my perspective.
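A toy version of the 1/3 vs. 1/9 comparison in the bullet on similarity above (purely illustrative; the fractions are the ones quoted there):

```python
# Toy comparison using the illustrative fractions above: how well resources get spent
# relative to "my values on reflection" under human control vs. AI control.
scenarios = {
    "current guess":           {"humans": 1 / 3,  "AIs": 1 / 9},
    "more pessimistic update": {"humans": 1 / 10, "AIs": 1 / 100},
}

for name, values in scenarios.items():
    ratio = values["humans"] / values["AIs"]
    print(f"{name}: human control looks ~{ratio:.0f}x better than AI control")
# The point: an update about how similar other humans are to me mostly propagates to
# AIs as well, so human control stays several times better rather than converging.
```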

Note that we're conditioning on safety/alignment technology failing to retain human control, so we should imagine correspondingly less human control over AI values.

I think that the fraction of computational resources of our lightcone used based on the result of a reasonably-good-reflection process seems similar between human control and AI control (perhaps 50%). It's possible to mess this up, of course, either by messing up the reflection or by locking in bad values too early. But when I look at the balance of arguments, humans messing this up seems about as likely to me as AIs messing this up. So the main question is what the result of such a process would be. One way to put this is that I don't expect humans to differ substantially from AIs in terms of how "thoughtful" they are.

I interpret one of your arguments as being "Humans won't be very thoughtful about how they spend vast, vast amounts of computational resources. After all, they aren't thoughtful right now." To the extent I buy this argument, I think it applies roughly equally well to AIs. So naively, it just divides both sides by the same factor rather than making AI look more favorable. (At least, if you accept that almost all of the value comes from being at least a bit thoughtful, which you also contest. See my arguments for that.)
