By Topic

A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

4 Author(s)
Grondman, I. ; Delft Center for Syst. & Control, Delft Univ. of Technol., Delft, Netherlands ; Busoniu, L. ; Lopes, G.A.D. ; Babuska, R.

Policy-gradient-based actor-critic algorithms are amongst the most popular algorithms in the reinforcement learning framework. Their advantage of being able to search for optimal policies using low-variance gradient estimates has made them useful in several real-life applications, such as robotics, power control, and finance. Although general surveys on reinforcement learning techniques already exist, no survey is specifically dedicated to actor-critic algorithms in particular. This paper, therefore, describes the state of the art of actor-critic algorithms, with a focus on methods that can work in an online setting and use function approximation in order to deal with continuous state and action spaces. After starting with a discussion on the concepts of reinforcement learning and the origins of actor-critic algorithms, this paper describes the workings of the natural gradient, which has made its way into many actor-critic algorithms over the past few years. A review of several standard and natural actor-critic algorithms is given, and the paper concludes with an overview of application areas and a discussion on open issues.

Published in:

Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on  (Volume:42 ,  Issue: 6 )