ϵ-optimal discretized linear reward-penalty learning automata

Authors: B. J. Oommen, School of Computer Science, Carleton University, Ottawa, Ont., Canada; J. P. R. Christensen

Variable-structure stochastic automata (VSSA) are considered which interact with an environment and which dynamically learn the optimal action that the environment offers. Like all VSSA, the automata are fully defined by a set of action-probability updating rules. However, to minimize the requirements on the random-number generator used to implement the VSSA, and to increase the speed of convergence of the automaton, the authors consider the case in which the probability-updating functions can assume only a finite number of values. These values discretize the probability space [0, 1], and hence the automata are called discretized learning automata. The discretized automata are linear because the subintervals of [0, 1] are of equal length. The authors prove the following results: (a) two-action discretized linear reward-penalty automata are ergodic and ε-optimal in all environments whose minimum penalty probability is less than 0.5; (b) there exist discretized two-action linear reward-penalty automata that are ergodic and ε-optimal in all random environments; and (c) discretized two-action linear reward-penalty automata with artificially created absorbing barriers are ε-optimal in all random environments.
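To illustrate the scheme described above, the following is a minimal simulation sketch of a two-action discretized linear reward-penalty automaton. The function name, parameters, and update convention (the action probability moves one lattice step of size 1/N toward the chosen action on reward and away from it on penalty) are illustrative assumptions, not the authors' exact formulation:

```python
import random

def simulate_dlrp(penalty_probs, resolution=100, steps=10000, seed=0):
    """Sketch of a two-action discretized linear reward-penalty automaton.

    The probability of choosing action 0 is constrained to the lattice
    {0, 1/N, ..., 1} (N = resolution), so the subintervals of [0, 1]
    have equal length.  On a reward, the probability of the chosen
    action moves one step toward 1; on a penalty, one step toward 0.
    penalty_probs[i] is the environment's probability of penalizing
    action i.  (Names and conventions here are illustrative.)
    """
    rng = random.Random(seed)
    n = resolution
    k = n // 2                     # p(action 0) = k / n, start at 0.5
    counts = [0, 0]                # how often each action is chosen
    for _ in range(steps):
        action = 0 if rng.random() < k / n else 1
        counts[action] += 1
        penalized = rng.random() < penalty_probs[action]
        if action == 0:
            # reward steps p(action 0) up; penalty steps it down
            k = max(0, k - 1) if penalized else min(n, k + 1)
        else:
            # symmetric update for action 1
            k = min(n, k + 1) if penalized else max(0, k - 1)
    return k / n, counts

# Action 0 has the smaller penalty probability (0.2 < 0.5), matching
# the condition in result (a), so the automaton should settle near
# choosing action 0 almost exclusively.
final_p0, counts = simulate_dlrp([0.2, 0.7])
```

Because the chain is ergodic rather than absorbing, the probability hovers near (but can briefly leave) the boundary; result (c) concerns a variant where artificial absorbing barriers remove this fluctuation.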

Published in:

IEEE Transactions on Systems, Man, and Cybernetics (Volume: 18, Issue: 3)