This paper presents a novel learning policy for multiagent reinforcement learning that seeks a more efficient tradeoff between exploration and exploitation than the traditional greedy or softmax action selection methods. The states and actions of the agents are represented as quantum superposition states, and probability amplitudes are used to denote the probability of each action. A quantum search algorithm is adopted for multiagent action selection. Experimental results show that the new algorithm is effective and helps the agents learn faster. Combining quantum computing with multiagent reinforcement learning in this way is an early attempt, and the idea may encourage further research in multiagent reinforcement learning.
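The amplitude-based action selection described in the abstract can be illustrated with a small classical simulation. The sketch below is not the authors' algorithm: it assumes a single agent, a textbook Grover-style iteration (sign flip of one marked action followed by reflection about the mean), and hypothetical function names (`uniform_amplitudes`, `grover_iteration`, `select_action`) chosen only for this example. How the paper chooses the marked action or the number of iterations is not stated in the abstract and is left open here.

```python
import math
import random


def uniform_amplitudes(n_actions):
    """Equal superposition: every action starts with amplitude 1/sqrt(n)."""
    return [1.0 / math.sqrt(n_actions)] * n_actions


def grover_iteration(amps, target):
    """One Grover-style step (assumed textbook form, simulated classically).

    1. Oracle: flip the sign of the marked action's amplitude.
    2. Diffusion: reflect every amplitude about the mean amplitude.
    The sum of squared amplitudes stays equal to 1.
    """
    flipped = [-a if i == target else a for i, a in enumerate(amps)]
    mean = sum(flipped) / len(flipped)
    return [2.0 * mean - a for a in flipped]


def action_probabilities(amps):
    """Measurement rule: probability of an action is its squared amplitude."""
    return [a * a for a in amps]


def select_action(amps, rng=random):
    """Sample one action index according to the squared amplitudes."""
    probs = action_probabilities(amps)
    return rng.choices(range(len(amps)), weights=probs, k=1)[0]


if __name__ == "__main__":
    amps = uniform_amplitudes(4)               # four hypothetical actions
    amps = grover_iteration(amps, target=2)    # amplify action 2 (illustrative)
    print(action_probabilities(amps))
    print(select_action(amps))
```

Before any iteration every action is equally likely, which corresponds to pure exploration; each iteration shifts probability toward the marked action, so the number of iterations acts as the knob between exploration and exploitation in this simplified picture.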
Published in: Intelligent Control and Automation, 2006. WCICA 2006. The Sixth World Congress on (Volume: 1)