
Learning-Rate Adjusting Q-Learning for Prisoner's Dilemma Games



Abstract:

Many multiagent Q-learning algorithms have been proposed to date, and most of them aim to converge to a Nash equilibrium, which is not desirable in games like the Prisoner's Dilemma (PD). In a previous paper, the author proposed utility-based Q-learning for PD, which uses utilities as rewards in order to maintain mutual cooperation once it has occurred. However, since an agent's action depends on the relative ordering of its Q-values, mutual cooperation can also be maintained by adjusting the learning rate of Q-learning directly. In this paper, we therefore work with the learning rate itself and introduce a new Q-learning method called learning-rate adjusting Q-learning, or LRA-Q.
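
The abstract does not spell out the adjustment rule, but the underlying idea can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not the paper's algorithm: it uses a stateless Q-update, standard PD payoffs (T=5, R=3, P=1, S=0), and a hypothetical multiplicative decay of the learning rate once mutual cooperation occurs; the class name LRAQAgent and all parameter values are invented for the example.

import random

# Prisoner's Dilemma payoffs: (my action, opponent's action) -> my reward.
# Standard values satisfying T > R > P > S.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ACTIONS = ["C", "D"]

class LRAQAgent:
    """Stateless Q-learner whose learning rate shrinks after mutual
    cooperation, so the ordering Q[C] > Q[D], once reached, is hard
    to overturn. The decay schedule is an assumption for illustration."""

    def __init__(self, alpha=0.5, alpha_min=0.01, decay=0.5, epsilon=0.1):
        self.q = {a: 0.0 for a in ACTIONS}
        self.alpha = alpha          # current learning rate
        self.alpha_min = alpha_min  # floor for the learning rate
        self.decay = decay          # hypothetical shrink factor
        self.epsilon = epsilon      # exploration probability

    def act(self):
        # Epsilon-greedy action selection over the two PD actions.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(self.q, key=self.q.get)

    def update(self, action, reward, mutual_cooperation):
        # Hypothetical rule: damp the learning rate once mutual
        # cooperation appears, so a stray defection by the opponent
        # cannot flip the relative ordering of the Q-values.
        if mutual_cooperation:
            self.alpha = max(self.alpha_min, self.alpha * self.decay)
        self.q[action] += self.alpha * (reward - self.q[action])

def play(rounds=1000):
    a, b = LRAQAgent(), LRAQAgent()
    for _ in range(rounds):
        act_a, act_b = a.act(), b.act()
        mutual = act_a == "C" and act_b == "C"
        a.update(act_a, PAYOFF[(act_a, act_b)], mutual)
        b.update(act_b, PAYOFF[(act_b, act_a)], mutual)
    return a.q, b.q

if __name__ == "__main__":
    print(play())

The point the abstract makes is visible in the update: once both agents' Q-values favor cooperation, a small learning rate limits how much any single payoff can move the Q-values, which is what keeps mutual cooperation in place.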
Date of Conference: 09-12 December 2008
Date Added to IEEE Xplore: 06 January 2009
Print ISBN: 978-0-7695-3496-1
Conference Location: Sydney, NSW, Australia
