Reinforcement learning with temporal logic rewards
Files
Accepted manuscript
Date
2017-01-01
DOI
Authors
Li, Xiao
Vasile, Cristian-Ioan
Belta, Calin
Version
OA Version
Citation
Xiao Li, Cristian-Ioan Vasile, Calin Belta. 2017. "Reinforcement Learning With Temporal Logic Rewards." 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3834-3839 (6).
Abstract
Reinforcement learning (RL) depends critically on
the choice of reward functions used to capture the desired behavior
and constraints of a robot. Usually, these are handcrafted
by an expert designer and represent heuristics for relatively
simple tasks. Real world applications typically involve more
complex tasks with rich temporal and logical structure. In this
paper we take advantage of the expressive power of temporal
logic (TL) to specify complex rules the robot should follow,
and incorporate domain knowledge into learning. We propose
Truncated Linear Temporal Logic (TLTL) as a specification
language that is arguably well suited for robotics
applications. We show in simulated trials that learning
is faster and policies obtained using the proposed approach
outperform the ones learned using heuristic rewards in terms
of the robustness degree, i.e., how well the tasks are satisfied.
Furthermore, we demonstrate the proposed RL approach in a
toast-placing task learned by a Baxter robot.
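
The robustness degree mentioned above can be illustrated with a small sketch. The code below is not the authors' implementation and does not reproduce the full TLTL definition; it assumes the common min/max quantitative semantics for a few temporal operators over a finite trajectory, and uses the resulting real number as an episode reward. All names (predicate, and_, eventually, always, reward) and the 1-D example task are hypothetical.

```python
from typing import Callable, Sequence

State = Sequence[float]
Trajectory = Sequence[State]
# A formula maps (trajectory, start index) to a real-valued robustness.
# Positive means satisfied, negative means violated (assumed convention).
Formula = Callable[[Trajectory, int], float]


def predicate(f: Callable[[State], float], c: float) -> Formula:
    """Robustness of the predicate f(s) < c at time t is c - f(s_t)."""
    return lambda traj, t: c - f(traj[t])


def and_(*phis: Formula) -> Formula:
    """Conjunction: the weakest conjunct determines the robustness."""
    return lambda traj, t: min(p(traj, t) for p in phis)


def eventually(phi: Formula) -> Formula:
    """Eventually: best robustness of phi over the remaining trajectory."""
    return lambda traj, t: max(phi(traj, k) for k in range(t, len(traj)))


def always(phi: Formula) -> Formula:
    """Always: worst robustness of phi over the remaining trajectory."""
    return lambda traj, t: min(phi(traj, k) for k in range(t, len(traj)))


# Hypothetical 1-D task: eventually get within 0.1 of x = 1.0,
# and always keep x below 1.5.
reach = eventually(predicate(lambda s: abs(s[0] - 1.0), 0.1))
safe = always(predicate(lambda s: s[0], 1.5))
spec = and_(reach, safe)


def reward(traj: Trajectory) -> float:
    """Episode reward: robustness of the specification from time 0."""
    return spec(traj, 0)


traj_good = [(0.0,), (0.5,), (1.0,), (1.0,)]
traj_bad = [(0.0,), (0.5,), (2.0,), (1.0,)]  # overshoots the safety bound

print(reward(traj_good))  # positive: specification satisfied
print(reward(traj_bad))   # negative: safety constraint violated
```

In this sketch the reward is a single number per trajectory, so it would suit an episodic policy search method rather than a per-step reward signal; that choice is an assumption of the example, not a statement about the paper's algorithm.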