Toward an optimized value iteration algorithm for average cost Markov decision processes

Authors:
Arruda, E.F. (School of Engineering, Pontifical Catholic University of Rio Grande do Sul, Porto Alegre, Brazil); Ourique, F.; Almudevar, A.

In this paper we propose a technique to accelerate the convergence rate of the value iteration (VI) algorithm applied to discrete average cost Markov decision processes (MDPs). The convergence rate is measured with respect to the total computational effort rather than the iteration counter. This rate definition makes it possible to compare different classes of algorithms that employ distinct, and possibly variable, updating schemes. A partial information value iteration (PIVI) algorithm is proposed that updates an increasingly accurate approximate version of the original problem, with a view toward saving computations in the early stages of the algorithm, when the iterates are typically far from the optimal solution. The overall computational effort of PIVI is compared with that of the classical VI algorithm over a broad set of parameters. The results suggest that a suitable choice of parameters can lead to significant computational savings in finding the optimal solution for discrete MDPs under the average cost criterion.
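For context, the baseline the paper accelerates is standard value iteration for average-cost MDPs, usually run in its relative form so the iterates stay bounded. The sketch below is a minimal relative value iteration in Python, not the paper's PIVI algorithm; the problem layout (a tabular MDP given as an action-indexed transition tensor `P` and cost matrix `c`, with state 0 as the reference state) is an illustrative assumption.

```python
import numpy as np

def relative_value_iteration(P, c, tol=1e-8, max_iter=10_000):
    """Relative value iteration for a unichain average-cost MDP.

    P : (A, S, S) array, P[a, s, s'] = transition probability.
    c : (A, S) array, c[a, s] = one-step cost of action a in state s.
    Returns (g, h, policy): average cost estimate, relative values
    (with h[0] == 0), and a greedy deterministic policy.
    """
    A, S, _ = P.shape
    h = np.zeros(S)
    for _ in range(max_iter):
        # One-step Bellman backup: Q[a, s] = c[a, s] + sum_{s'} P[a, s, s'] h[s']
        Q = c + P @ h
        Th = Q.min(axis=0)          # minimize over actions
        g = Th[0]                   # offset at reference state 0
        h_new = Th - g              # keep values relative, so iterates stay bounded
        if np.max(np.abs(h_new - h)) < tol:
            h = h_new
            break
        h = h_new
    policy = Q.argmin(axis=0)
    return g, h, policy
```

Subtracting the value at a fixed reference state each sweep is what distinguishes relative VI from plain VI: under the average-cost criterion the unnormalized values grow linearly in the iteration count, while the differences `h` converge. Each sweep costs O(A·S²) operations, which is the per-iteration effort PIVI aims to reduce by updating an approximate model early on.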

Published in:

2010 49th IEEE Conference on Decision and Control (CDC)

Date of Conference:

15-17 Dec. 2010