In learning performance, natural gradient actor-critic algorithms outperform conventional actor-critic algorithms, especially in high-dimensional spaces. However, the problem of representing stochastic policies and value functions remains, because actor-critic approaches require these representations to be designed carefully. The author has proposed random rectangular coarse coding, which is very simple and well suited to approximating Q-values in high-dimensional state-action spaces. This paper presents a quantitative analysis of random rectangular coarse coding in comparison with regular-grid approaches, and a new approach that combines the natural gradient actor-critic with random rectangular coarse coding.
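The abstract does not spell out how the rectangular features are constructed, so the following is only a minimal sketch of one plausible reading: each binary feature is an axis-aligned rectangle that constrains a randomly chosen subset of the state-action dimensions to a random interval, and a Q-value is approximated as a weighted sum of the active features. The names `make_rectangles`, `features`, and `q_value`, the parameter `max_dims`, and the uniform sampling of interval endpoints are all assumptions made for illustration, not details taken from the paper.

```python
import random


def make_rectangles(n_features, low, high, max_dims, rng):
    """Draw random axis-aligned rectangles over a box [low, high].

    Assumption (not from the paper): each rectangle constrains between 1 and
    max_dims randomly chosen dimensions; all other dimensions are unconstrained.
    """
    n_dims = len(low)
    rects = []
    for _ in range(n_features):
        k = rng.randint(1, min(max_dims, n_dims))
        rect = {}
        for d in rng.sample(range(n_dims), k):
            a = rng.uniform(low[d], high[d])
            b = rng.uniform(low[d], high[d])
            rect[d] = (min(a, b), max(a, b))  # interval on dimension d
        rects.append(rect)
    return rects


def features(x, rects):
    """Binary feature vector: 1.0 where x lies inside a rectangle, else 0.0."""
    return [
        1.0 if all(lo <= x[d] <= hi for d, (lo, hi) in rect.items()) else 0.0
        for rect in rects
    ]


def q_value(x, rects, weights):
    """Linear Q-value estimate for a concatenated state-action vector x."""
    return sum(w * f for w, f in zip(weights, features(x, rects)))


# Hypothetical usage: a 3-dimensional state plus a 1-dimensional action.
rng = random.Random(0)
rects = make_rectangles(200, [0.0] * 4, [1.0] * 4, max_dims=2, rng=rng)
weights = [0.0] * len(rects)
estimate = q_value([0.2, 0.7, 0.5, 0.1], rects, weights)
```

Under these assumptions the number of features is chosen directly rather than growing exponentially with the number of dimensions, as it would for a regular grid with a fixed resolution per dimension.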
Published in: SICE Annual Conference, 2008
Date of Conference: 20-22 Aug. 2008