Abstract:
In reinforcement learning, for cost and safety reasons, a policy is usually learned in a simulation environment and then applied to the real world. However, the learned policy often cannot adapt, because real-world disturbances and robot failures create gaps between the two environments. To narrow such gaps, policies that can adapt to various scenarios are needed. In this study, we propose a reinforcement learning method for acquiring a policy that is robust to robot failures. In the proposed method, a failure is represented by adjusting the physical parameters of the robot, and training under various faults is achieved by randomizing these parameters during learning. In experiments on quadruped walking tasks, we demonstrate that a robot trained with the proposed method achieves higher average rewards than a conventionally trained robot in simulation environments both with and without robot failures.
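The core idea, randomizing the robot's physical parameters during training so that faults appear as ordinary environment variation, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the environment and agent interfaces (`set_joint_torque_scales`, `agent.act`, `agent.update`), the joint count, and the fault probability are all assumed placeholders.

```python
import random

NUM_JOINTS = 12  # assumed joint count for a quadruped

def sample_torque_scales(fault_prob=0.5):
    """Per-joint torque multipliers: 1.0 means a healthy joint; a value
    near 0 emulates a weakened or broken actuator (one plausible way to
    represent 'failure' via physical parameters)."""
    scales = [1.0] * NUM_JOINTS
    if random.random() < fault_prob:  # assumed fraction of faulty episodes
        scales[random.randrange(NUM_JOINTS)] = random.uniform(0.0, 0.5)
    return scales

def train(env, agent, episodes=10_000):
    """Domain-randomized training loop: a fresh fault configuration is
    sampled at the start of every episode, so the learned policy must
    perform well across healthy and faulty dynamics."""
    for _ in range(episodes):
        env.set_joint_torque_scales(sample_torque_scales())  # randomize physics
        obs, done = env.reset(), False
        while not done:
            action = agent.act(obs)
            obs, reward, done = env.step(action)
            agent.update(obs, reward, done)
```

One design choice worth noting: resampling the fault once per episode (rather than per step) keeps the dynamics stationary within an episode, which matches how a persistent hardware failure would present itself during deployment.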
Date of Conference: 05-08 December 2020
Date Added to IEEE Xplore: 21 January 2021