Learning automata in games with memory with application to circuit-switched routing

Author:
Alanyali, M.; Department of Electrical and Computer Engineering, Boston University, MA, USA

A general setting is considered in which autonomous users interact by means of a finite-state controlled Markov process. This process is driven by the collective actions of all users, and each user receives a separate reward that depends on the state of the process. It is assumed that each user chooses its actions via a reinforcement learning algorithm based on its local information. The dynamic behavior of user strategies is characterized for small values of the step-size parameter used in learning. The general form of equilibria is obtained and is shown to be analogous to Wardrop equilibria if users update their strategies on a faster time-scale than the underlying process. The results are illustrated in the context of routing in circuit-switched communication networks.
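The abstract does not say which reinforcement learning rule the users follow. As a rough illustration only, the sketch below assumes the classical linear reward-inaction scheme for a learning automaton, in which a user nudges its mixed strategy toward the action it just played, in proportion to a small step size and the reward received. The names `lri_update` and `simulate` are hypothetical. So is the toy reward model: a fixed acceptance probability per route, with no underlying Markov state. None of this is taken from the paper.

```python
import random


def lri_update(probs, action, reward, step):
    """One linear reward-inaction step (assumed rule, not from the paper).

    Moves the probability vector toward the chosen action by an amount
    proportional to step * reward; reward is assumed to lie in [0, 1].
    A zero reward leaves the strategy unchanged.
    """
    return [
        p + step * reward * ((1.0 if i == action else 0.0) - p)
        for i, p in enumerate(probs)
    ]


def simulate(mean_rewards, step=0.01, rounds=5000, seed=0):
    """Single user choosing among routes with fixed success probabilities.

    mean_rewards[i] is a hypothetical probability that a call placed on
    route i is accepted (reward 1); otherwise the reward is 0.
    """
    rng = random.Random(seed)
    probs = [1.0 / len(mean_rewards)] * len(mean_rewards)
    for _ in range(rounds):
        # Sample a route according to the current mixed strategy.
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward = 1.0 if rng.random() < mean_rewards[action] else 0.0
        probs = lri_update(probs, action, reward, step)
    return probs


if __name__ == "__main__":
    print(simulate([0.8, 0.2]))
```

With a small `step`, the strategy changes slowly from round to round, which is the regime the abstract refers to when it characterizes behavior for small step sizes. The toy omits the interaction between users and the controlled Markov process that the paper actually analyzes.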

Published in:

43rd IEEE Conference on Decision and Control (CDC), 2004, Volume 5

Date of Conference:

14-17 Dec. 2004