The inverted pendulum is a benchmark control problem that is both nonlinear and unstable, and controller design for it becomes more difficult when the plant dynamics include model uncertainties and unknown disturbances. In this paper, a kernel-based reinforcement learning controller is developed for inverted pendulums with unknown dynamics and stochastic disturbances. The learning controller uses approximate policy iteration, with kernel-based least-squares temporal difference learning for policy evaluation. Owing to the nonlinear approximation ability of kernel methods, the approximate policy iteration process achieves good convergence properties and learning efficiency, so that the controller performance can be optimized in a few iterations. Simulation results demonstrate that the proposed learning controller for stochastic inverted pendulums can achieve much better performance than previous learning control approaches such as Q-learning with function approximation and least-squares policy iteration (LSPI).
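To make the policy-evaluation step concrete, the following is a minimal sketch of one common form of kernel-based least-squares temporal difference learning for action values, followed by a greedy policy-improvement helper. It is an illustration under stated assumptions, not the controller from the paper: the Gaussian kernel, the small ridge term, and the names `gaussian_kernel`, `solve_linear`, `klstd_q`, and `greedy_action` are all hypothetical choices, and the abstract does not specify the kernel, any sparsification step, or the pendulum model.

```python
import math


def gaussian_kernel(x, y, width=1.0):
    """Gaussian (RBF) kernel between two equal-length feature tuples (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def klstd_q(samples, gamma, width=1.0, ridge=1e-8):
    """Fit Q(x) = sum_j alpha_j * k(x, x_j) from transitions.

    samples: list of (x, reward, x_next), where x and x_next are
    state-action feature tuples and x_next already contains the action
    chosen by the policy being evaluated.
    ridge: small regularizer added to the diagonal (an assumption here).
    """
    centers = [x for x, _, _ in samples]
    n = len(centers)
    A = [[ridge if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [0.0] * n
    for x, reward, x_next in samples:
        k_x = [gaussian_kernel(x, c, width) for c in centers]
        k_next = [gaussian_kernel(x_next, c, width) for c in centers]
        for i in range(n):
            b[i] += k_x[i] * reward
            for j in range(n):
                # LSTD fixed-point equation written in kernel features
                A[i][j] += k_x[i] * (k_x[j] - gamma * k_next[j])
    alpha = solve_linear(A, b)

    def q(x):
        return sum(a * gaussian_kernel(x, c, width) for a, c in zip(alpha, centers))

    return q


def greedy_action(q, state, actions):
    """Policy improvement: pick the action with the largest estimated Q."""
    return max(actions, key=lambda a: q(tuple(state) + (a,)))
```

Approximate policy iteration would alternate these two steps: evaluate the current policy with `klstd_q`, then act greedily with `greedy_action` to define the next policy. Using every sample as a kernel center keeps the sketch short but scales poorly with the number of samples; practical implementations typically keep only a subset of centers.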
Published in:
Advanced Intelligent Mechatronics (AIM), 2010 IEEE/ASME International Conference on
Date of Conference: 6-9 July 2010