The RoboCup Simulation League is recognized as a test bed for research on multi-agent learning. As an example of multi-agent learning in a soccer game, we address a learning problem between a kicker and a receiver when a direct free kick is awarded just outside the opponent's penalty area. In such a situation, to which point should the kicker kick the ball? We propose a function that expresses heuristics for evaluating how advantageous a target point is for safely sending and receiving a pass and for scoring. The heuristics include an interaction term between the kicker and the receiver to strengthen their coordination. To calculate the interaction term, we give the kicker a model of the receiver's action decision and the receiver a model of the kicker's, so that each agent can predict its teammate's action. The evaluation function makes it possible to handle a large state space consisting of the positions of the kicker, the receiver, and their opponents. The kicker selects the target point of the free kick by Boltzmann selection with the evaluation function. The parameters of the function can be learned by a kind of reinforcement learning called the policy gradient method. The point to which the receiver should run to receive the ball is learned simultaneously in the same manner. The effectiveness of our solution was shown by experiments.
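As a rough illustration of how Boltzmann selection and a policy gradient update fit together, the following Python sketch may help. It is not the method of this paper: the abstract does not give the form of the evaluation function, so the sketch assumes a hypothetical evaluation that is linear in its weights, E(x) = sum_j w_j * U_j(x), where the feature values U_j(x) for each candidate target point are placeholders. All function names, the temperature, and the learning rate are likewise assumptions, and the update is the standard REINFORCE-style rule for a softmax policy.

```python
import math
import random

def boltzmann_probs(scores, temperature=1.0):
    """Boltzmann (softmax) probabilities over candidate scores."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def evaluate(weights, features):
    """Assumed linear evaluation of one candidate: sum_j w_j * U_j."""
    return sum(w * u for w, u in zip(weights, features))

def select_target(weights, candidates, temperature=1.0, rng=random):
    """Sample a candidate index by Boltzmann selection; return (index, probs)."""
    scores = [evaluate(weights, f) for f in candidates]
    probs = boltzmann_probs(scores, temperature)
    index = rng.choices(range(len(candidates)), weights=probs, k=1)[0]
    return index, probs

def policy_gradient_step(weights, candidates, chosen, probs, reward,
                         lr=0.1, temperature=1.0):
    """REINFORCE-style update: w_j += lr * reward * d ln p(chosen) / d w_j.

    For a softmax over a linear evaluation, the gradient of ln p(chosen)
    with respect to w_j is (U_j(chosen) - E_p[U_j]) / temperature.
    """
    new_weights = []
    for j, w in enumerate(weights):
        expected = sum(p * f[j] for p, f in zip(probs, candidates))
        grad = (candidates[chosen][j] - expected) / temperature
        new_weights.append(w + lr * reward * grad)
    return new_weights
```

In such a setup, each free-kick episode would call `select_target` once, observe a reward (for example, positive for a received pass or a goal, which is again an assumption), and then call `policy_gradient_step`. A positive reward raises the probability of the chosen target point on later trials, and a receiver could be trained with the same two functions over its own candidate run points.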