Online policy iteration based algorithms to solve the continuous-time infinite horizon optimal control problem

3 Author(s)
Vamvoudakis, K.; Vrabie, D.; Lewis, F. (Automation & Robotics Research Institute, University of Texas at Arlington, Fort Worth, TX)

In this paper we discuss two online algorithms based on policy iteration for learning the continuous-time (CT) optimal control solution for nonlinear systems with an infinite-horizon quadratic cost. For the first time we present an online adaptive algorithm, implemented on an actor/critic structure, that involves synchronous continuous-time adaptation of both the actor and the critic neural networks. This is a version of generalized policy iteration for CT systems. Convergence to the optimal controller under the novel algorithm is proven, while stability of the system is guaranteed. The characteristics and requirements of the new online learning algorithm are discussed in relation to the regular online policy iteration algorithm for CT systems that we have previously developed. The latter solves the optimal control problem by performing sequential updates on the actor and critic networks, i.e., while one is learning, the other is held constant. In contrast, the new algorithm relies on simultaneous adaptation of both the actor and the critic networks. A simulation example is included to support the new theoretical result.
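The sequential scheme contrasted above (hold the actor fixed while the critic evaluates the current policy, then improve the actor from the critic) can be sketched on a scalar linear-quadratic problem, where policy evaluation reduces to a scalar Lyapunov equation. The system, constants, and variable names below are our own illustrative choices, not the paper's neural-network implementation, which handles nonlinear systems and adapts online:

```python
import math

# Sequential (Kleinman-style) policy iteration for a scalar continuous-time
# LQ problem: x_dot = a*x + b*u, cost J = integral of (Q*x^2 + R*u^2) dt.
# Toy instance for illustration only; the paper's algorithms tune actor and
# critic neural networks online, but the evaluate-then-improve structure of
# the sequential algorithm is the same.
a, b, Q, R = -1.0, 1.0, 1.0, 1.0

k = 0.0  # initial stabilizing policy u = -k*x (a - b*k = -1 < 0, so stable)
for _ in range(6):
    # Policy evaluation (critic step, actor held fixed): solve the scalar
    # Lyapunov equation 2*(a - b*k)*P + Q + R*k**2 = 0 for the cost P.
    P = (Q + R * k**2) / (2.0 * (b * k - a))
    # Policy improvement (actor step, critic held fixed): k = (1/R)*b*P.
    k = (b / R) * P

# P converges to the positive root of the Riccati equation
# 2*a*P - (b**2 / R) * P**2 + Q = 0, which is sqrt(2) - 1 for these values.
print(P, math.sqrt(2) - 1)
```

The synchronous algorithm proposed in the paper removes the strict alternation: both weight sets are adapted simultaneously in continuous time, which this offline sketch does not capture.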

Published in:

2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL '09)

Date of Conference:

March 30 - April 2, 2009