Learning Reward Modalities for Human-Robot-Interaction in a Cooperative Training Task

2 Author(s)
Anja Austermann ; The Graduate University for Advanced Studies (SOKENDAI), 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430 Japan ; Seiji Yamada

This paper proposes a novel method for learning a user's preferred reward modalities for human-robot interaction by solving a cooperative training task. A learning algorithm based on a combination of adaptable pre-trained hidden Markov models and a computational model of classical conditioning is outlined. In a training task where the desired outcome is known to an AIBO pet robot as well as to its human instructor, the robot can freely explore human reward behavior. With this method, the robot is able to learn situated, user-specific reward behavior in different modalities, such as gestures, speech and interaction, using the robot's built-in sensors. After the training phase, the learned reward behavior can be used as a basis for reinforcement learning of more complex tasks. A preliminary experimental study is presented that investigates the effects of restricting the possible reward modalities when teaching a pet robot. The results of the experiments suggest that being able to provide reward freely makes users give more reward than in a scenario where reward modalities are restricted. Moreover, the experiments showed that even when a restriction on possible reward modalities is introduced, users tend to give reward that does not conform to the restriction.
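
The abstract does not say which computational model of classical conditioning is used. As a purely illustrative sketch, the Rescorla-Wagner rule (a standard model, swapped in here as an assumption and not taken from the paper) shows how a robot could associate observed user behaviors with a known task outcome. The cue labels, the function name, and the parameter values are all hypothetical, and the sketch assumes the behaviors have already been recognized, for example by the hidden Markov models.

```python
from collections import defaultdict


def rescorla_wagner_update(strengths, present_cues, reward_present,
                           learning_rate=0.2, max_strength=1.0):
    """Apply one Rescorla-Wagner trial and return the prediction error.

    strengths      -- dict mapping cue label to associative strength
    present_cues   -- cue labels observed in this trial (assumed to be
                      already-recognized user behaviors)
    reward_present -- True if the robot knows the outcome was positive
    """
    target = max_strength if reward_present else 0.0
    # The prediction is the summed strength of all cues present in the trial.
    prediction = sum(strengths[cue] for cue in present_cues)
    error = target - prediction
    # Every present cue moves toward the target by a share of the error.
    for cue in present_cues:
        strengths[cue] += learning_rate * error
    return error


# Hypothetical cue labels, not taken from the paper.
strengths = defaultdict(float)
for _ in range(20):
    # The user says something after a correct move (positive outcome).
    rescorla_wagner_update(strengths, ["speech:praise"], True)
    # The user waves after an incorrect move (no positive outcome).
    rescorla_wagner_update(strengths, ["gesture:wave"], False)

for cue, value in sorted(strengths.items()):
    print(f"{cue}: {value:.3f}")
```

In this sketch, a behavior that reliably accompanies a positive outcome gains associative strength over repeated trials, while a behavior that never accompanies one stays at zero. The learned strengths could then serve as the reward signal for later reinforcement learning, in the sense the abstract describes.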

Published in:

RO-MAN 2007 - The 16th IEEE International Symposium on Robot and Human Interactive Communication

Date of Conference:

26-29 Aug. 2007