
LanceSBush comments on Cognitive Science/Psychology As a Neglected Approach to AI Safety - Effective Altruism Forum


Comment author: Kaj_Sotala 12 June 2017 06:57:32AM

My main objection is that even if we pursue this project, it does not achieve the heavy metaethical lifting you were alluding to earlier. It doesn’t demonstrate or provide any particularly good reason to regard the outputs of this process as moral truth.

Well, what alternative would you propose? I don't see how it would even be possible to get any stronger evidence for the moral truth of a theory than the failure of everyone to come up with convincing objections to it even after extended investigation. Nor can I see a strategy for testing that truth which wouldn't at some point reduce to "test X gives us reason to disagree with the theory".

I would understand your disagreement if you were a moral antirealist, but your comments seem to imply that you do believe that a moral truth exists and that it's possible to get information about it, and that it's possible to do "heavy metaethical lifting". But how?

I want to convert all matter in the universe to utilitronium.

I think anything as specific as this sounds worryingly close to wanting an AI to implement favoritepoliticalsystem.

What the first communist revolutionaries thought would happen, as the empirical consequence of their revolution, was that people’s lives would improve: laborers would no longer work long hours at backbreaking labor and make little money from it. This turned out not to be the case, to put it mildly. But what the first communists thought would happen, was not so very different from what advocates of other political systems thought would be the empirical consequence of their favorite political systems. They thought people would be happy. They were wrong.

Now imagine that someone should attempt to program a “Friendly” AI to implement communism, or libertarianism, or anarcho-feudalism, or favoritepoliticalsystem, believing that this shall bring about utopia. People’s favorite political systems inspire blazing suns of positive affect, so the proposal will sound like a really good idea to the proposer.

We could view the programmer’s failure on a moral or ethical level—say that it is the result of someone trusting themselves too highly, failing to take into account their own fallibility, refusing to consider the possibility that communism might be mistaken after all. But in the language of Bayesian decision theory, there’s a complementary technical view of the problem. From the perspective of decision theory, the choice for communism stems from combining an empirical belief with a value judgment. The empirical belief is that communism, when implemented, results in a specific outcome or class of outcomes: people will be happier, work fewer hours, and possess greater material wealth. This is ultimately an empirical prediction; even the part about happiness is a real property of brain states, though hard to measure. If you implement communism, either this outcome eventuates or it does not. The value judgment is that this outcome satisfices or is preferable to current conditions. Given a different empirical belief about the actual real-world consequences of a communist system, the decision may undergo a corresponding change.

We would expect a true AI, an Artificial General Intelligence, to be capable of changing its empirical beliefs (or its probabilistic world-model, et cetera). If somehow Charles Babbage had lived before Nicolaus Copernicus, and somehow computers had been invented before telescopes, and somehow the programmers of that day and age successfully created an Artificial General Intelligence, it would not follow that the AI would believe forever after that the Sun orbited the Earth. The AI might transcend the factual error of its programmers, provided that the programmers understood inference rather better than they understood astronomy. To build an AI that discovers the orbits of the planets, the programmers need not know the math of Newtonian mechanics, only the math of Bayesian probability theory.

The folly of programming an AI to implement communism, or any other political system, is that you’re programming means instead of ends. You’re programming in a fixed decision, without that decision being re-evaluable after acquiring improved empirical knowledge about the results of communism. You are giving the AI a fixed decision without telling the AI how to re-evaluate, at a higher level of intelligence, the fallible process which produced that decision.
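
To make the decision-theoretic point in the quoted passage concrete, here is a rough sketch (the policy names, outcomes, probabilities, and utilities below are invented purely for illustration, not taken from the quoted text): the empirical belief is a probability distribution over outcomes given a policy, the value judgment is a utility function over outcomes, and the difference between programming means and programming ends is whether the decision is hard-coded or re-derived from the agent's current beliefs.

```python
# Illustrative sketch only: the policies, outcomes, probabilities, and
# utilities below are invented for the example.

def expected_utility(policy, belief, utility):
    """Combine an empirical belief P(outcome | policy) with a value
    judgment U(outcome) into an expected utility for the policy."""
    return sum(p * utility[outcome] for outcome, p in belief[policy].items())

def ends_based_choice(policies, belief, utility):
    """Programming ends: re-derive the decision from current beliefs
    and a fixed utility function."""
    return max(policies, key=lambda pol: expected_utility(pol, belief, utility))

def hardcoded_choice(policies, belief, utility):
    """Programming means: the decision is fixed in advance, so improved
    empirical knowledge about its consequences changes nothing."""
    return "implement communism"

policies = ["implement communism", "keep status quo"]

# Value judgment over outcomes (held fixed throughout).
utility = {"people flourish": 1.0, "people suffer": -1.0}

# The revolutionaries' empirical belief about consequences...
optimistic_belief = {
    "implement communism": {"people flourish": 0.9, "people suffer": 0.1},
    "keep status quo":     {"people flourish": 0.5, "people suffer": 0.5},
}
# ...and a belief revised after observing the actual consequences.
revised_belief = {
    "implement communism": {"people flourish": 0.1, "people suffer": 0.9},
    "keep status quo":     {"people flourish": 0.5, "people suffer": 0.5},
}

for belief in (optimistic_belief, revised_belief):
    print(ends_based_choice(policies, belief, utility),
          "| hardcoded:", hardcoded_choice(policies, belief, utility))
# Under the optimistic belief both pick "implement communism"; under the
# revised belief only the ends-based agent switches to "keep status quo".
```

Note that the utility function never changes in this sketch; only the empirical belief does, and that alone is enough to flip the ends-based decision while the hard-coded one stays stuck.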

Comment author: LanceSBush 12 June 2017 02:54:30PM

Whoops. I can see how my responses didn't make my own position clear.

I am an anti-realist, and I think the prospects for identifying anything like moral truth are very low. I favor abandoning attempts to frame discussions of AI or pretty much anything else in terms of converging on or identifying moral truth.

I consider efforts to integrate important and substantive discussions into contemporary moral philosophy likely to be futile. If engaging with moral philosophy introduces unproductive digressions, confusions, or misplaced priorities into the discussion, it may do more harm than good.

I'm puzzled by this remark:

I think anything as specific as this sounds worryingly close to wanting an AI to implement favoritepoliticalsystem.

I view utilitronium as an end, not a means. It is a logical consequence of wanting to maximize aggregate utility and is more or less a logical entailment of my moral views. I favor the production of whatever physical state of affairs yields the highest aggregate utility. This is, by definition, "utilitronium." If I'm using the term in an unusual way, I'm happy to propose a new label that conveys what I have in mind.

Comment author: Kaj_Sotala 16 June 2017 02:41:54PM

I am an anti-realist, and I think the prospects for identifying anything like moral truth are very low. I favor abandoning attempts to frame discussions of AI or pretty much anything else in terms of converging on or identifying moral truth.

Ah, okay. Well, in that case you can just read my original comment as an argument for why one would want to use psychology to design an AI that was capable of correctly figuring out just a single person's values and implementing them, as that's obviously a prerequisite for figuring out everybody's values. The stuff that I had about social consensus was just an argument aimed at moral realists; if you're not one, then it's probably not relevant for you.

(my values would still say that we should try to take everyone's values into account, but that disagreement is distinct from the whole "is psychology useful for value learning" question)

I'm puzzled by this remark:

Sorry, my mistake - I confused utilitronium with hedonium.