Using naïve text queries for robust audio information retrieval

4 Author(s)
Samuel Kim ; Signal Analysis and Interpretation Lab (SAIL), University of Southern California, Los Angeles, USA ; Panayiotis Georgiou ; Shrikanth Narayanan ; Shiva Sundaram

The goal of this work is to build an audio information retrieval system that gives users flexibility in formulating their queries, from audio examples to naïve text. Specifically, this paper focuses on using naïve text to create input queries that describe the information users are looking for. Using naïve text queries, however, raises interoperability issues between the annotation and retrieval processes because of the wide variety of available audio descriptions. We propose an intermediate audio description layer (iADL) to resolve these issues. The iADL comprises two axes, corresponding to semantic and onomatopoeic descriptions, based on human-to-human communication experiments on how humans express sounds verbally. Various text modeling schemes, such as latent semantic analysis (LSA) and latent topic models, are used to map the naïve text onto the proposed iADL.
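As a rough illustration of the LSA step mentioned in the abstract, the sketch below projects short text descriptions into a low-dimensional latent space with a truncated SVD and ranks them against a text query by cosine similarity. It is not the authors' implementation: the toy annotations, the function names (`build_vocab`, `bag_of_words`, `project`, `rank`), and the choice of two latent dimensions are all assumptions made for this example, and the two-axis iADL itself is not modeled.

```python
import numpy as np

# Hypothetical toy annotations mixing semantic and onomatopoeic words.
docs = [
    "dog barking loudly woof",
    "dog growling woof woof",
    "car engine vroom rumble",
    "engine revving vroom",
]

def build_vocab(texts):
    """Return the sorted list of distinct whitespace-separated words."""
    return sorted({w for t in texts for w in t.split()})

def bag_of_words(text, vocab):
    """Count vector over vocab; out-of-vocabulary words are ignored."""
    index = {w: i for i, w in enumerate(vocab)}
    vec = np.zeros(len(vocab))
    for w in text.split():
        if w in index:
            vec[index[w]] += 1.0
    return vec

vocab = build_vocab(docs)
# Term-document matrix: rows are terms, columns are documents.
A = np.stack([bag_of_words(d, vocab) for d in docs], axis=1)

# Truncated SVD: keep the k largest singular values (k is an assumption).
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k, s_k = U[:, :k], s[:k]

def project(text):
    """Fold a text into the latent space: S_k^{-1} U_k^T q."""
    return (U_k.T @ bag_of_words(text, vocab)) / s_k

doc_vecs = np.stack([project(d) for d in docs])

def rank(query):
    """Return document indices sorted by cosine similarity to the query."""
    q = project(query)
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q) + 1e-12
    sims = (doc_vecs @ q) / norms
    return [int(i) for i in np.argsort(-sims)]

print(rank("woof dog"))
print(rank("vroom engine"))
```

With this toy data, a query such as "woof dog" ranks the two dog-related annotations ahead of the engine-related ones, even though each annotation uses different surrounding words. A real system would use a much larger annotation corpus and a tuned number of latent dimensions.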

Published in:

2010 IEEE International Conference on Acoustics, Speech and Signal Processing

Date of Conference:

14-19 March 2010