WFST Enabled Solutions to ASR Problems: Beyond HMM Decoding

5 Author(s)

Over the last decade, weighted finite-state transducers (WFSTs) have become popular in speech recognition. Although their main application remains hidden Markov model (HMM) decoding, the WFST framework is now also used as a building block in solutions to many other central problems in automatic speech recognition (ASR). These solutions are less well known, and this work gives an overview of the applications of WFSTs in large-vocabulary continuous speech recognition (LVCSR) beyond HMM decoding: discriminative acoustic model training, Bayes risk decoding, and system combination. Applying the WFST framework has considerable practical impact: we show how it helps to structure problems, to develop generic solutions, and to delegate complex computations to WFST toolkits. In this paper, we review the literature, discuss existing approaches, and provide new insights into WFST-enabled solutions. We also present a novel, purely WFST-based algorithm for computing the exact Bayes risk hypothesis from a lattice with the Levenshtein distance as the loss function. We present the problems and their solutions in a unified framework and discuss the advantages and limits of using WFSTs. We do not provide new experimental results but refer to the existing literature. Our work helps to identify where and how the transducer framework can contribute to compact and generic solutions to LVCSR problems.
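
To make the notion of a Bayes risk hypothesis under a Levenshtein loss concrete, the following is a minimal Python sketch. It is not the WFST-based lattice algorithm announced in the abstract: it assumes that both the candidate hypotheses and the evidence are restricted to a small N-best list with known posterior probabilities, so it only approximates the exact search over a lattice. The names `levenshtein`, `bayes_risk`, and `mbr_decode` are hypothetical and chosen for illustration.

```python
from typing import Dict, Sequence, Tuple

# A hypothesis is a word sequence, stored as a tuple so it can be a dict key.
Hyp = Tuple[str, ...]


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance (substitutions, insertions, deletions) between a and b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def bayes_risk(candidate: Hyp, posteriors: Dict[Hyp, float]) -> float:
    """Expected Levenshtein loss of `candidate` under the given posteriors."""
    return sum(p * levenshtein(candidate, w) for w, p in posteriors.items())


def mbr_decode(posteriors: Dict[Hyp, float]) -> Hyp:
    """Pick the N-best entry with minimum expected loss.

    Assumption: the search space is the N-best list itself, not the full
    set of word sequences encoded by a lattice.
    """
    return min(posteriors, key=lambda c: bayes_risk(c, posteriors))
```

In this setting the minimum Bayes risk hypothesis can differ from the most probable one: an entry with a lower posterior may still win if it is close, in edit distance, to most of the remaining probability mass.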

Published in: IEEE Transactions on Audio, Speech, and Language Processing (Volume 20, Issue 2)

Date of Publication: Feb. 2012
