A formal methods approach to interpretability, safety and composability for reinforcement learning
MetadataShow full item record
Robotic systems that are capable of learning from experience have recently become more common place. These systems have demonstrated success in learning difficult control tasks. However, as tasks become more complex and the number of options to reason about becomes greater, there is an increasing need to be able to specify the desired behavior in a structured and interpretable fashion, guarantee system safety, conveniently integrate task specific knowledge with more general knowledge about the world and generate new skills from learned ones without additional exploration. This thesis addresses these problems specifically in the case of reinforcement learning (RL) by using techniques from formal methods. Experience and prior knowledge shape the way humans make decisions when asked to perform complex tasks. Conversely, robots have had difficulty incorporating a rich set of prior knowledge when solving complex planning and control problems. In RL, the reward offers an avenue for incorporating prior knowledge. However, incorporating such knowledge is not always straightforward using standard reward engineering techniques. This thesis presents a formal specification language that can combine a base of general knowledge with task specifications to generate richer task descriptions. For example, to make a hotdog at the task level, one needs to grab a sausage, grill it, place the cooked sausage in a bun, apply ketchup, and serve. Prior knowledge about the context of the task, e.g., sausages can be damaged if squeezed too hard, should also be taken into account. Interpretability in RL rewards - easily understanding what the reward function represents and knowing how to improve it - is a key component in understanding the behavior of an RL agent. This property is often missing in reward engineering techniques, which makes it difficult to understand exactly what the implications of the reward function are when tasks become complex. Interpretability of the reward allows for better value alignment between human intent and system objectives, leading to a lower likelihood of reward hacking by the system. The formal specification language presented in this work has the added benefit of being easily interpretable for its similarity with natural language. Safe RL - guaranteeing undesirable behaviors do not occur (i.e. collisions with obstacles), is a critical concern when learning and deployment of robotic systems happen in the real world. Safety for these systems not only presents legal challenges to their wide adoption, but also raises risks to hardware and users. By using techniques from formal methods and control theory, we provide two main components to ensure safety in the RL agent behaviors. First, the formal specification language allows for explicit definition of undesirable behaviors (e.g. always avoid collisions). Second, control barrier functions (CBF) are used to enforce these safety constraints. Composability of learned skills - the ability to compose new skills from a library of learned ones can significantly enhance a robot's capabilities by making efficient use of past experience. Modern RL systems focus mainly on mastery (maximizing the given reward) and less on generalization (transfer from one task domain to another). In this thesis, we will also exploit the logical and graphical representations of the task specification and develop techniques for skill composition.