In this paper, we describe a novel method of word acquisition through multimodal interaction between a humanoid robot and humans. The developed robot acquires words, specifically verbs, from raw multimodal sensory stimuli by watching the movement of given objects and listening to utterances spoken by humans, without symbolic representations of semantics. In addition, the robot can utter the learnt words based on its own phonemes, which correspond to a categorical phonetic feature map. We consider that words bind directly to non-symbolic, perceptual physical features, such as visual features of the given object and acoustic features of the given utterance, apart from any symbolic representation of semantics.
Published in:
Robotics and Biomimetics (ROBIO), 2009 IEEE International Conference on
Date of Conference: 19-23 Dec. 2009