Scheduled System Maintenance:
Some services will be unavailable Sunday, March 29th through Monday, March 30th. We apologize for the inconvenience.
By Topic

Learning of soccer player agents using a policy gradient method: Coordination between kicker and receiver during free kicks

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Igarashi, H. ; Dept. of Inf. Sci. & Eng., Sibaura Inst. of Technol., Tokyo ; Nakamura, K. ; Ishihara, S.

The RoboCup Simulation League is recognized as a test bed for research on multi-agent learning. As an example of multi-agent learning in a soccer game, we dealt with a learning problem between a kicker and a receiver when a direct free kick is awarded just outside the opponentpsilas penalty area. In such a situation, to which point should the kicker kick the ball? We propose a function that expresses heuristics to evaluate an advantageous target point for safely sending/receiving a pass and scoring. The heuristics includes an interaction term between a kicker and a receiver to intensify their coordination. To calculate the interaction term, we let kicker/receiver agents have a receiver/kicker action decision model to predict his teammatepsilas action. The evaluation function makes it possible to handle a large space of states consisting of the positions of a kicker, a receiver, and their opponents. The target point of the free kick is selected by the kicker using Boltzmann selection with an evaluation function. Parameters in the function can be learned by a kind of reinforcement learning called the policy gradient method. The point to which a receiver should run to receive the ball is simultaneously learned in the same manner. The effectiveness of our solution was shown by experiments.

Published in:

Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on

Date of Conference:

1-8 June 2008