Spoken document retrieval: acoustic variability over the past 100 years

1 Author(s)
Hansen, J.H.L. ; Erik Jonsson Sch. of Eng. & Comput. Sci., Texas Univ., Dallas, TX

Summary form only given. Reliable speech recognition for information retrieval is challenging when data is recorded across different media, with known or unknown equipment, and in different speaking environments. In this talk, we consider problems in audio stream phrase recognition for spoken document retrieval (SDR) from audio materials spanning the past 110 years. When considering audio transcription for SDR, what should be transcribed? Audio content for broadcast news includes commercials, competing speakers, radio call-in shows, and background music, over a wide range of recording conditions. This talk considers the evolution of SDR needed over the past 100 years, with emphasis on acoustic variability due to speaker, noise, and equipment, while text-processing-based concepts are considered in the following presentation by Jerome Bellegarda, Apple Corp. Early recordings during the late 1890s and early 1900s were carefully structured and scripted, but employed Edison wax cylinder disk recording formats, resulting in reasonable speech structure but poor acoustic recordings. As the cost and ease of recording speeches, debates, and broadcast transmissions evolved, less structured audio content became more common, with a wider range of equipment. The explosion of audio materials, audio Web portals, and audio file-sharing frameworks makes cataloging and organizing audio content for SDR increasingly important and challenging. Varying audio formats for file sharing, as well as the need to ensure ownership through digital watermarking, introduce a number of issues that can also impact speech recognition performance for SDR. We consider a number of areas and approaches taken for effective SDR, and discuss directions for future information detection schemes for richer information retrieval in the next generation of SDR. Finally, as audio material continues to expand at a rapid pace, automatic transcription support for digital archives and libraries will be needed in the future.

Published in:

Automatic Speech Recognition and Understanding, 2005 IEEE Workshop on

Date of Conference:

27 Nov. 2005