Robots controlled by Reinforcement Learning (RL) are still rare. A core challenge in applying RL to robotic systems is learning in the presence of control delay: the delay between measuring a system's state and acting upon it. Control delay is always present in real systems. In this work, we present two novel temporal difference (TD) learning algorithms for problems with control delay. These algorithms improve learning performance by taking the control delay into account. We test our algorithms in a gridworld, where the delay is an integer multiple of the time step, as well as in a simulated robotic system, where the delay can take any value. In both tests, our proposed algorithms outperform classical TD learning algorithms while maintaining low computational complexity.
Date of Conference: 18-22 Oct. 2010