Subword Modeling for Automatic Speech Recognition: Past, Present, and Emerging Approaches

3 Author(s)
Karen Livescu (TTI-Chicago, Chicago, Illinois 60637 USA); Eric Fosler-Lussier; Florian Metze

Modern automatic speech recognition systems handle large vocabularies of words, making it infeasible to collect enough repetitions of each word to train individual word models. Instead, large-vocabulary recognizers represent each word in terms of subword units. Typically the subword unit is the phone, a basic speech sound such as a single consonant or vowel. Each word is then represented as a sequence, or several alternative sequences, of phones specified in a pronunciation dictionary. Other choices of subword units have been studied as well. The choice of subword units, and the way in which the recognizer represents words in terms of combinations of those units, is the problem of subword modeling. Different subword models may be preferable in different settings, such as high-variability conversational speech, high-noise conditions, low-resource settings, or multilingual speech recognition. This article reviews past, present, and emerging approaches to subword modeling. To make clean comparisons between many approaches, the review uses the unifying language of graphical models.

Published in:

IEEE Signal Processing Magazine (Volume: 29, Issue: 6)