
Sample Complexity of Model-Based Robust Reinforcement Learning


Abstract:

We consider the problem of learning the optimal robust value function and the optimal robust policy in a discounted-reward Robust Markov Decision Process (RMDP). The goal of the RMDP framework is to find a policy that is robust to parameter uncertainties arising from the mismatch between the simulator model and the real-world setting. While the optimal robust value function and policy can be computed using robust dynamic programming, this requires exact knowledge of the nominal simulator model and of the uncertainty set around it. This paper proposes a model-based robust reinforcement learning algorithm that learns an ϵ-optimal robust value function and policy in a finite state and action space setting when the nominal simulator model is not exactly known. We assume access to a standard generative sampling model, which can generate next-state samples for all state-action pairs of the nominal simulator model. We give a precise characterization of the sample complexity of obtaining an ϵ-optimal robust value function and policy using our algorithm. Finally, we demonstrate the performance of our algorithm on some benchmark problems.
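
To make the model-based approach described in the abstract concrete, the sketch below illustrates one way such an algorithm could look: empirical transition probabilities are estimated from generative-model samples, and robust value iteration is then run against a worst case over an uncertainty set around that empirical model. This is an illustrative sketch only, not the authors' exact algorithm; the `sample_next_state` interface, the `radius` parameter, and the choice of an R-contamination uncertainty set (for which the worst-case expectation has a closed form) are all assumptions made for the example.

```python
import numpy as np

def robust_value_iteration(sample_next_state, rewards, n_states, n_actions,
                           gamma=0.9, radius=0.1, n_samples=1000, n_iters=500):
    """Illustrative model-based robust value iteration (not the paper's exact method).

    sample_next_state(s, a, n): hypothetical generative-model interface that
    returns n i.i.d. next-state indices for (s, a) under the nominal model.
    rewards: array of shape (n_states, n_actions).
    The uncertainty set is an R-contamination ball of size `radius`
    around the empirical nominal model (an assumption for this sketch).
    """
    # Step 1: build the empirical nominal transition kernel from generative samples.
    P_hat = np.zeros((n_states, n_actions, n_states))
    for s in range(n_states):
        for a in range(n_actions):
            samples = sample_next_state(s, a, n_samples)
            counts = np.bincount(samples, minlength=n_states)
            P_hat[s, a] = counts / n_samples

    # Step 2: robust Bellman iterations. For the R-contamination set, the
    # worst-case expected value mixes the nominal expectation with the
    # value of the worst single state.
    V = np.zeros(n_states)
    for _ in range(n_iters):
        nominal = P_hat @ V                          # shape (n_states, n_actions)
        worst = (1 - radius) * nominal + radius * V.min()
        Q = rewards + gamma * worst
        V = Q.max(axis=1)

    # Greedy robust policy with respect to the final value estimate.
    Q = rewards + gamma * ((1 - radius) * (P_hat @ V) + radius * V.min())
    policy = Q.argmax(axis=1)
    return V, policy
```

The sample-complexity question the paper studies corresponds, in this sketch, to how large `n_samples` must be (as a function of the state and action space sizes, the discount factor, and ϵ) for the returned value function and policy to be ϵ-optimal.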
Date of Conference: 14-17 December 2021
Date Added to IEEE Xplore: 01 February 2022
Conference Location: Austin, TX, USA
