Kaj_Sotala comments on My current thoughts on MIRI's "highly reliable agent design" work - Effective Altruism Forum

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (57)

You are viewing a single comment's thread.

Comment author: Kaj_Sotala 08 July 2017 11:32:13PM 11 points [-]

I haven't found any instances of complete axiomatic descriptions of AI systems being used to mitigate problems in those systems (e.g. to predict, postdict, explain, or fix them) or to design those systems in a way that avoids problems they'd otherwise face. [...] It seems plausible that the kinds of axiomatic descriptions that HRAD work could produce would be too taxing to be usefully applied to any practical AI system.

I wonder if slightly analogous example could be found in the design of concurrent systems.

As you may know, it's surprisingly difficult to design software that has multiple concurrent processes manipulating the same data. You typically either screw up by letting the processes edit the same data at the same time or in the wrong order, or by having them wait for each other forever.

So to help reason more clearly about this kind of thing, people developed different forms of temporal logic that let them express in a maximally unambiguous form different desiderata that they have for the system. Temporal logic lets you express statements that say things like "if a process wants to have access to some resource, it will eventually enter a state where it has access to that resource". You can then use temporal logic to figure out how exactly you want your system to behave, in order for it to do the things you want it to do and not run into any problems.

Building a logical model of how you want your system to behave is not the same thing as building the system. The logic only addresses one set of desiderata: there are many others it doesn't address at all, like what you want the UI to be like and how to make the system efficient in terms of memory and processor use. It's a model that you can use for a specific subset of your constraints, both for checking whether the finished system meets those constraints, and for building a system so that it's maximally easy for it to meet those constraints. Although the model is not a whole solution, having the model at hand before you start writing all the concurrency code is going to make things a lot easier for you than if you didn't have any clear idea of how you wanted the concurrent parts to work and were just winging it as you went.

So similarly, if MIRI developed HRAD into a sufficiently sophisticated form, it might yield a set of formal desiderata of how we want the AI to function, as well as an axiomatic model that can be applied to a part of the AI's design, to make sure everything goes as intended. But I would guess that it wouldn't really be a "complete axiomatic descriptions of" the system, in the way that temporal logics aren't a complete axiomatic description of modern concurrent systems.

Comment author: Daniel_Dewey 10 July 2017 07:28:46PM 3 points [-]

Thanks for this suggestion, Kaj -- I think it's an interesting comparison!