In ordinary reinforcement learning algorithms, a single agent learns to achieve a goal through many episodes. If the learning problem is complicated, acquiring the optimal policy may take a long computation time. Meanwhile, for optimization problems, population-based methods such as particle swarm optimization have been recognized as able to rapidly find the global optimum of multi-modal functions with a wide solution space. We recently proposed reinforcement learning algorithms in which multiple agents learn not only through their respective experiences but also by exchanging information among themselves. In these algorithms, the design of the information-exchange method is important. This paper proposes several methods of exchanging information based on the update equations of particle swarm optimization. The proposed algorithms using these methods are applied to a shortest path problem, and their performance is compared through numerical experiments.
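The abstract does not state the exchange rules themselves. As a rough illustration only, the sketch below applies the standard particle swarm optimization velocity and position update to one agent's Q-values, pulling them toward that agent's own best Q-values and the best Q-values found by any agent. The function name `pso_exchange`, the flat-list layout (one entry per state-action pair), and the coefficients `w`, `c1`, `c2` are assumptions made for this example and are not taken from the paper.

```python
import random


def pso_exchange(q, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """Apply one standard PSO update to a flat list of Q-values.

    q     -- the agent's current Q-values (hypothetical flat layout)
    v     -- the agent's current "velocity", same length as q
    pbest -- the best Q-values this agent has held so far
    gbest -- the best Q-values held by any agent so far
    rng   -- any object with a random() method returning a float in [0, 1)
    """
    new_q, new_v = [], []
    for qi, vi, pi, gi in zip(q, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()
        # Inertia term plus attraction toward personal and global bests.
        vi_next = w * vi + c1 * r1 * (pi - qi) + c2 * r2 * (gi - qi)
        new_v.append(vi_next)
        new_q.append(qi + vi_next)
    return new_q, new_v
```

In a multi-agent setting, such a step could be run for every agent after each episode, alongside each agent's own learning update. How "best" is judged (for example, by the episode's path length in the shortest path problem) is a further design choice that this sketch leaves open.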
Date of Conference: 12-15 Oct. 2008