TY - GEN
T1 - Model-based Reinforcement Learning with Provable Safety Guarantees via Control Barrier Functions
AU - Zhang, Hongchao
AU - Li, Zhouchi
AU - Clark, Andrew
N1 - Publisher Copyright: © 2021 IEEE
PY - 2021
Y1 - 2021
N2 - Safety is a critical property in applications including robotics, transportation, and energy. Safety is especially challenging in reinforcement learning (RL) settings, in which uncertainty of the system dynamics may cause safety violations during exploration. Control Barrier Functions (CBFs), which enforce safety by constraining the control actions at each time step, are a promising approach for safety-critical control. This technique has been applied to ensure the safety of model-free RL; however, it has not been integrated into model-based RL. In this paper, we propose Uncertainty-Tolerant Control Barrier Functions (UTCBFs), a new class of CBFs to incorporate model uncertainty and provide provable safety guarantees with a desired probability. Furthermore, we introduce an algorithm for model-based RL to guarantee safety by integrating CBFs with gradient-based policy search. Our approach is verified through a numerical study of a cart-pole system and an inverted pendulum system with comparison to state-of-the-art RL algorithms.
AB - Safety is a critical property in applications including robotics, transportation, and energy. Safety is especially challenging in reinforcement learning (RL) settings, in which uncertainty of the system dynamics may cause safety violations during exploration. Control Barrier Functions (CBFs), which enforce safety by constraining the control actions at each time step, are a promising approach for safety-critical control. This technique has been applied to ensure the safety of model-free RL; however, it has not been integrated into model-based RL. In this paper, we propose Uncertainty-Tolerant Control Barrier Functions (UTCBFs), a new class of CBFs to incorporate model uncertainty and provide provable safety guarantees with a desired probability. Furthermore, we introduce an algorithm for model-based RL to guarantee safety by integrating CBFs with gradient-based policy search. Our approach is verified through a numerical study of a cart-pole system and an inverted pendulum system with comparison to state-of-the-art RL algorithms.
UR - http://www.scopus.com/inward/record.url?scp=85125504744&partnerID=8YFLogxK
U2 - 10.1109/ICRA48506.2021.9561253
DO - 10.1109/ICRA48506.2021.9561253
M3 - Conference contribution
AN - SCOPUS:85125504744
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 792
EP - 798
BT - 2021 IEEE International Conference on Robotics and Automation, ICRA 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Robotics and Automation, ICRA 2021
Y2 - 30 May 2021 through 5 June 2021
ER -