By Topic

Interactive Spoken Document Retrieval With Suggested Key Terms Ranked by a Markov Decision Process

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Yi-Cheng Pan ; MediaTek, Inc., Hsinchu, Taiwan ; Hung-yi Lee ; Lin-shan Lee

Interaction with users is a powerful strategy that potentially yields better information retrieval for all types of media, including text, images, and videos. While spoken document retrieval (SDR) is a crucial technology for multimedia access in the network era, it is also more challenging than text information retrieval because of the inevitable recognition errors. It is therefore reasonable to consider interactive functionalities for SDR systems. We propose an interactive SDR approach in which given the user's query, the system returns not only the retrieval results but also a short list of key terms describing distinct topics. The user selects these key terms to expand the query if the retrieval results are not satisfactory. The entire retrieval process is organized around a hierarchy of key terms that define the allowable state transitions; this is modeled by a Markov decision process, which is popularly used in spoken dialogue systems. By reinforcement learning with simulated users, the key terms on the short list are properly ranked such that the retrieval success rate is maximized while the number of interactive steps is minimized. Significant improvements over existing approaches were observed in preliminary experiments performed on information needs provided by real users. A prototype system was also implemented.

Published in:

Audio, Speech, and Language Processing, IEEE Transactions on  (Volume:20 ,  Issue: 2 )