Real-time reinforcement learning is difficult because, in practice, the number of episodes required is too large for learning to complete within a limited time. Animals, on the other hand, also learn by trial and error, yet they complete learning within a limited time. The conventional framework cannot explain this. In this paper, we address the pursuit problem using the optical information tau and information of direction, which is a physical property. We demonstrated that a fugitive robot could learn a policy to escape from a predator robot in a small number of episodes.
Published in: 2008 IEEE Conference on Soft Computing in Industrial Applications (SMCia '08)
Date of Conference: 25-27 June 2008