Integrating Recognition and Retrieval With Relevance Feedback for Spoken Term Detection

Authors: Hung-yi Lee, Chia-ping Chen, and Lin-shan Lee (Dept. of Commun. Eng., Nat. Taiwan Univ., Taipei, Taiwan)

Recognition and retrieval are typically viewed as two cascaded, independent modules for spoken term detection (STD): retrieval techniques are applied on top of automatic speech recognition (ASR) output, so performance depends on ASR accuracy. We propose a framework that integrates recognition and retrieval, considering them jointly in order to yield better STD performance. This can be achieved either by adjusting the acoustic model parameters (model-based) or by considering detected examples (example-based), using relevance information provided by the user (user relevance feedback) or inferred by the system (pseudo-relevance feedback), either for a given query (short-term context) or by taking many previous queries into account (long-term context). Such relevance feedback approaches have long been used in text information retrieval, but they are rarely considered for the retrieval of spoken content and cannot be applied to it directly. The proposed relevance feedback approaches are specific to spoken content retrieval and are hence very different from those developed for text retrieval, which operate only on text symbols. We not only present these relevance feedback scenarios and approaches for STD, but also propose a framework that integrates them all. Preliminary experiments showed significant improvements in each case.
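To make the example-based pseudo-relevance feedback idea concrete, the following is a minimal sketch: first-pass detection hypotheses are ranked by score, the top few are treated as pseudo-relevant examples, and every hypothesis is re-scored by interpolating its first-pass score with its average similarity to those examples. The function names, the cosine similarity measure, and the interpolation weight are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch of example-based pseudo-relevance feedback (PRF)
# re-ranking for spoken term detection. Feature vectors stand in for
# acoustic representations of detected segments; the scoring rule is a
# hypothetical simplification, not the paper's method.

from math import sqrt

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def prf_rerank(hypotheses, top_n=2, alpha=0.5):
    """hypotheses: list of (first_pass_score, feature_vector) pairs.

    Treat the top_n first-pass detections as pseudo-relevant examples,
    then re-score each hypothesis by interpolating its first-pass score
    with its average similarity to those examples.
    """
    ranked = sorted(hypotheses, key=lambda h: h[0], reverse=True)
    pseudo_relevant = [feat for _, feat in ranked[:top_n]]
    rescored = []
    for score, feat in hypotheses:
        sim = sum(cosine(feat, p) for p in pseudo_relevant) / len(pseudo_relevant)
        rescored.append((alpha * score + (1 - alpha) * sim, feat))
    return sorted(rescored, key=lambda h: h[0], reverse=True)
```

With user relevance feedback, the pseudo-relevant set would instead come from examples the user explicitly labels; the re-scoring step stays the same.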

Published in:

IEEE Transactions on Audio, Speech, and Language Processing (Volume 20, Issue 7)