Safe Model-Based Reinforcement Learning With an Uncertainty-Aware Reachability Certificate


Abstract:

Safe reinforcement learning (RL), which seeks policies that satisfy constraints, offers a promising path to broader safety-critical applications of RL in real-world problems such as robotics. Among safe RL approaches, model-based methods further reduce training-time violations owing to their high sample efficiency. However, a lack of safety robustness against model uncertainties remains an issue in safe model-based RL, especially for training-time safety. In this paper, we propose a distributional reachability certificate (DRC) and its Bellman equation to address model uncertainties and characterize robust persistently safe states. Furthermore, we build a safe RL framework to resolve the constraints required by the DRC and its corresponding shield policy. We also devise a line search method that maintains safety while reaching higher returns when leveraging the shield policy. Comprehensive experiments on classical benchmarks such as constrained tracking and navigation indicate that the proposed algorithm achieves comparable returns with far fewer constraint violations during training. Our code is available at https://github.com/ManUtdMoon/Distributional-Reachability-Policy-Optimization.

Note to Practitioners: Although RL has been shown to be applicable to complex robotic control tasks, training an RL control policy induces frequent failures because the agent must learn safety through constraint violations. This issue hinders the adoption of RL because a large number of robot failures is too expensive to afford. This paper aims to reduce the training-time violations of RL-based control methods, enabling RL to be used in a broader range of applications. To achieve this goal, we first introduce a safety quantity describing the distribution of potential constraint violations in the long term. By imposing constraints on the quantile of the safety distribution, we can realize safety that is robust to model uncertainty, which is necessary...
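The idea of constraining a quantile of a safety distribution, rather than its mean, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sign convention (a positive value means a predicted constraint violation), the nearest-rank quantile estimator, and the sample values are all assumptions made for illustration only.

```python
import math


def upper_quantile(samples, alpha):
    """Empirical alpha-quantile of a list of floats (nearest-rank method)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(samples)
    rank = math.ceil(alpha * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]


def is_robustly_safe(violation_samples, alpha=0.9, threshold=0.0):
    """Accept a state only if the alpha-quantile of the predicted
    long-term violation does not exceed the threshold (assumed convention:
    values above the threshold indicate a violation)."""
    return upper_quantile(violation_samples, alpha) <= threshold


# Hypothetical long-term violation estimates for one state, e.g. one value
# per member of a learned dynamics-model ensemble (illustrative numbers).
ensemble_estimates = [-0.4, -0.3, -0.2, -0.1, 0.3]

mean_estimate = sum(ensemble_estimates) / len(ensemble_estimates)
mean_says_safe = mean_estimate <= 0.0
quantile_says_safe = is_robustly_safe(ensemble_estimates, alpha=0.9)

print("mean-based check:", mean_says_safe)
print("quantile-based check:", quantile_says_safe)
```

In this toy example the average estimate looks safe, while the upper quantile exposes the one pessimistic prediction. Under these assumptions, that is the sense in which a quantile constraint is more conservative against model uncertainty than a constraint on the expectation.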
Published in: IEEE Transactions on Automation Science and Engineering (Volume: 21, Issue: 3, July 2024)
Page(s): 4129 - 4142
Date of Publication: 27 November 2023

Dongjie Yu
School of Vehicle and Mobility, Tsinghua University, Beijing, China
Dongjie Yu received the B.E. and M.S. degrees from Tsinghua University, Beijing, China, in 2020 and 2023, respectively.
His current research interests include safe reinforcement learning and its applications in decision-making and control for robotics and autonomous driving.
Wenjun Zou
School of Vehicle and Mobility, Tsinghua University, Beijing, China
Wenjun Zou received the B.E. degree in automotive engineering from Tsinghua University, Beijing, China, in 2020, where he is currently pursuing the Ph.D. degree in mechanical engineering.
His current research interests include decision and control of autonomous vehicles and reinforcement learning.
Yujie Yang
School of Vehicle and Mobility, Tsinghua University, Beijing, China
Yujie Yang received the B.E. degree in automotive engineering from Tsinghua University, Beijing, China, in 2021, where he is currently pursuing the Ph.D. degree with the School of Vehicle and Mobility.
His research interests include decision and control of autonomous vehicles, reinforcement learning, and optimal control.
Haitong Ma
John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, USA
Haitong Ma received the B.S. and M.S. degrees in vehicle engineering from Tsinghua University in 2019 and 2022, respectively. He is currently pursuing the Ph.D. degree with the John A. Paulson School of Engineering and Applied Sciences (SEAS), Harvard University.
His research interests lie at the intersection of control theory and machine learning. He received the Outstanding Master’s Graduate and Master’s Thesis awards of Tsinghua University, was an L4DC Best Paper Award finalist, and received the ITSC Best Student Paper Award and the championship of the Honda Eco Mileage Challenge in China.
Shengbo Eben Li
School of Vehicle and Mobility, Tsinghua University, Beijing, China
Shengbo Eben Li (Senior Member, IEEE) received the M.S. and Ph.D. degrees from Tsinghua University in 2006 and 2009, respectively. He was with Stanford University, the University of Michigan, and the University of California at Berkeley. He is currently a tenured Professor with Tsinghua University. He is the author of over 100 journal and conference papers and the co-inventor of over 20 Chinese patents. His research interests include intelligent vehicles and driver assistance, reinforcement learning and distributed control, and optimal control and estimation.
He was a recipient of the Best Paper Award at the IEEE ITS Symposium in 2014, the Best Paper Award at the 14th ITS Asia–Pacific Forum, the National Award for Technological Invention in China in 2013, the Excellent Young Scholar of NSF China in 2016, and the Young Professorship of the Changjiang Scholar Program in 2016. He serves as an Associate Editor for IEEE Intelligent Transportation Systems Magazine and the IEEE Transactions on Intelligent Transportation Systems.
Yuming Yin
College of Mechanical Engineering, Zhejiang University of Technology, Zhejiang, Hangzhou, China
Yuming Yin received the M.S. degree in vehicle engineering from the University of Science and Technology Beijing in 2013 and the Ph.D. degree in mechanical engineering from Concordia University, Canada, in 2017. He is currently an Associate Professor with the School of Mechanical Engineering, Zhejiang University of Technology, and a Visiting Research Fellow with the School of Vehicle and Mobility, Tsinghua University. He is the author of about 30 peer-reviewed journal and conference papers and a PI or co-PI of national and state projects. His active research interests include ground vehicle system dynamics, marginal emergence control, and model-data mixed reinforcement learning.
Jianyu Chen
Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
Jianyu Chen received the bachelor’s degree from Tsinghua University in 2015 and the Ph.D. degree from the University of California at Berkeley in 2020. He has been an Assistant Professor with the Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University, since 2020. Prior to that, he worked with Prof. Masayoshi Tomizuka at the University of California at Berkeley. He works at the intersection of robotics, reinforcement learning, control, and autonomous driving. His research goal is to build advanced robotic systems with high performance and high intelligence.
Jingliang Duan
School of Mechanical Engineering, University of Science and Technology Beijing, Beijing, China
Jingliang Duan received the Ph.D. degree from the School of Vehicle and Mobility, Tsinghua University, China, in 2021.
He was a Visiting Student Researcher with the Department of Mechanical Engineering, University of California at Berkeley, in 2019, and a Research Fellow with the Department of Electrical and Computer Engineering, National University of Singapore, from 2021 to 2022. He is currently an Associate Professor with the School of Mechanical Engineering, University of Science and Technology Beijing, China. His research interests include reinforcement learning, optimal control, and self-driving decision-making.
