A new class of ε-optimal learning automata


Authors: Papadimitriou, Georgios I. (Dept. of Informatics, Aristotle Univ., Thessaloniki, Greece); Sklira, M.; Pomportsis, Andreas S.

A new class of P-model absorbing learning automata is introduced. The proposed automata use a stochastic estimator to achieve rapid and accurate convergence when operating in stationary random environments. Under the proposed stochastic estimator scheme, the estimates of the reward probabilities of the actions do not depend strictly on the environmental responses. The dependence between the stochastic estimates and the deterministic ones is looser for actions that have been selected only a few times. Such actions therefore have the opportunity to be estimated as "optimal," to increase their choice probability, and consequently to be selected. As they are selected more often, their estimates become more reliable, and the automaton converges rapidly and accurately to the optimal action. The asymptotic behavior of the proposed scheme is analyzed, and the scheme is proved to be ε-optimal in every stationary random environment. Furthermore, extensive simulation results indicate that the proposed stochastic estimator scheme converges faster than the deterministic-estimator-based DPRI and DGPA schemes when operating in stationary P-model random environments.
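The abstract does not give the update rules, so the following is only a minimal sketch of the general idea, not the authors' scheme. It assumes Gaussian noise added to the deterministic estimate (rewards divided by selections), a noise standard deviation of `noise_scale / (count + 1)` so that rarely selected actions get noisier estimates, and a pursuit-style discretized update toward the action with the highest stochastic estimate. The class name, `noise_scale`, `resolution`, and the `run` helper are all hypothetical.

```python
import random


class StochasticEstimatorAutomaton:
    """Illustrative absorbing automaton driven by noisy reward estimates."""

    def __init__(self, n_actions, resolution=100, noise_scale=1.0, seed=None):
        self.n = n_actions
        self.p = [1.0 / n_actions] * n_actions       # action probabilities
        self.delta = 1.0 / (n_actions * resolution)  # smallest probability step
        self.rewards = [0] * n_actions                # rewards received per action
        self.counts = [0] * n_actions                 # selections per action
        self.noise_scale = noise_scale                # assumed noise parameter
        self.rng = random.Random(seed)

    @property
    def converged(self):
        # Absorbing state: one action holds (numerically) all the probability.
        return max(self.p) > 1.0 - 1e-9

    def choose(self):
        return self.rng.choices(range(self.n), weights=self.p)[0]

    def stochastic_estimates(self):
        estimates = []
        for r, c in zip(self.rewards, self.counts):
            deterministic = r / c if c else 0.0
            # Assumption: noise shrinks as an action is selected more often.
            sigma = self.noise_scale / (c + 1)
            estimates.append(deterministic + self.rng.gauss(0.0, sigma))
        return estimates

    def update(self, action, rewarded):
        self.counts[action] += 1
        self.rewards[action] += int(rewarded)  # P-model: response is 0 or 1
        if self.converged:
            return  # absorbing: probabilities no longer change
        u = self.stochastic_estimates()
        best = max(range(self.n), key=lambda i: u[i])
        # Move up to delta of probability from every other action to `best`.
        for i in range(self.n):
            if i != best:
                step = min(self.delta, self.p[i])
                self.p[i] -= step
                self.p[best] += step


def run(reward_probs, seed=0, max_steps=20000):
    """Simulate a stationary P-model environment with fixed reward probabilities."""
    automaton = StochasticEstimatorAutomaton(len(reward_probs), seed=seed)
    env = random.Random(seed + 1)
    for _ in range(max_steps):
        if automaton.converged:
            break
        a = automaton.choose()
        automaton.update(a, env.random() < reward_probs[a])
    return automaton


if __name__ == "__main__":
    result = run([0.2, 0.8])
    print(result.p, result.counts)
```

In this sketch the noise term is what lets a seldom-tried action occasionally rank highest and gain probability; once its count grows, the noise fades and its estimate approaches the deterministic one.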

Published in: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics (Volume: 34, Issue: 1)

Date of Publication: Feb. 2004
