
Active and unsupervised learning for spoken word acquisition through a multimodal interface

Author:

N. Iwahashi, ATR Spoken Language Translation Lab., Kyoto, Japan

Abstract:

This work presents a new interactive learning method for spoken word acquisition through human-machine multimodal interfaces. During learning, the machine decides, using both speech and visual cues, whether a spoken input word already belongs to the lexicon it has learned. Learning is carried out online and incrementally, combining active and unsupervised learning principles. If the machine judges with high confidence that its decision is correct, it learns the statistical models of the word and of a corresponding image class, representing the word's meaning, in an unsupervised way; otherwise, it actively asks the user a question. The function used to estimate the degree of confidence is itself adapted online. Experimental results show that the method enables the machine and the user to adapt to each other, making the learning process more efficient.
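The abstract outlines a decision loop: score each input with speech and visual cues, learn unsupervised when confidence is high, query the user when it is low, and adapt the confidence estimator online from the answers. The Python sketch below illustrates that loop under stated assumptions; the cue scores, the logistic confidence function, and all names here are placeholders, not the paper's actual statistical models or confidence estimator.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class InteractiveWordLearner:
    """Illustrative sketch of the active/unsupervised decision loop:
    accept high-confidence decisions without supervision, ask the user
    about low-confidence ones, and adapt the confidence function online.
    Model details are hypothetical stand-ins for the paper's models."""

    def __init__(self, threshold=0.9, lr=0.05):
        self.threshold = threshold
        self.lr = lr
        # Weights of a simple logistic confidence function over two cues.
        self.w = np.zeros(3)  # [bias, speech cue, visual cue]

    def confidence(self, speech_score, visual_score):
        x = np.array([1.0, speech_score, visual_score])
        return sigmoid(self.w @ x), x

    def process(self, speech_score, visual_score, in_lexicon_guess, ask_user):
        """One online learning step for an utterance-image pair.

        speech_score / visual_score: cue scores from placeholder acoustic
            and image models for the best-matching lexical entry.
        in_lexicon_guess: the machine's tentative known-word decision.
        ask_user: callback returning the true decision when queried.
        """
        conf, x = self.confidence(speech_score, visual_score)
        if conf >= self.threshold:
            # Unsupervised branch: trust the guess; the word and image
            # class models would be updated here without asking the user.
            decision, supervised = in_lexicon_guess, False
        else:
            # Active branch: confidence is low, so ask the user a question.
            decision, supervised = ask_user(), True
        if supervised:
            # Adapt the confidence function from the user's answer:
            # logistic-regression gradient step on whether the machine's
            # own guess turned out to be correct.
            target = 1.0 if decision == in_lexicon_guess else 0.0
            self.w -= self.lr * (conf - target) * x
        return decision, supervised
```

In use, each utterance-image pair would be passed to `process` as it arrives, so the threshold on the adapted confidence function gradually shifts queries away from inputs the machine already handles reliably.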

Published in:

13th IEEE International Workshop on Robot and Human Interactive Communication (ROMAN 2004)

Date of Conference:

20-22 Sept. 2004