Bayesian Sensing Hidden Markov Models

Authors: Saon, G. (IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA); Jen-Tzung Chien

In this paper, we introduce Bayesian sensing hidden Markov models (BS-HMMs) to represent sequential data based on a set of state-dependent basis vectors. The goal of this work is to perform Bayesian sensing and model regularization for heterogeneous training data. By incorporating a prior density on sensing weights, the relevance of different bases to a feature vector is determined by the corresponding precision parameters. The BS-HMM parameters, consisting of the basis vectors, the precision matrices of sensing weights and the precision matrices of reconstruction errors, are jointly estimated by maximizing the likelihood function, which is marginalized over the weight priors. We derive recursive solutions for the three parameters, which are expressed via maximum a posteriori estimates of the sensing weights. We specifically optimize BS-HMMs for large-vocabulary continuous speech recognition (LVCSR) by introducing a mixture model of BS-HMMs and by adapting the basis vectors to different speakers. Discriminative training of BS-HMMs in the model domain and the feature domain is also proposed. Experimental results on an LVCSR task show consistent improvements due to the three sets of BS-HMM parameters and demonstrate how the extensions of mixture models, speaker adaptation, and discriminative training achieve better recognition results compared to those of conventional HMMs based on Gaussian mixture models.
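The abstract does not give the estimation formulas, so the following is only a minimal sketch of how a maximum a posteriori estimate of sensing weights can be computed in a generic linear-Gaussian model, not the paper's actual recursions. It assumes a feature vector x = sum_n w_n * phi_n + error, zero-mean Gaussian priors on the weights, and, for simplicity, diagonal precision matrices. The function names `solve_linear` and `map_sensing_weights` and all parameter names are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def solve_linear(a: Matrix, b: Vector) -> Vector:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def map_sensing_weights(basis: Matrix, x: Vector,
                        weight_precision: Vector,
                        error_precision: Vector) -> Vector:
    """MAP weights under the assumed model (not taken from the paper).

    basis[n] is the n-th basis vector. With Phi holding the basis vectors
    as columns, A = diag(weight_precision) and R = diag(error_precision),
    this solves (Phi^T R Phi + A) w = Phi^T R x.
    """
    n_bases = len(basis)
    dim = len(x)
    lhs = [[sum(basis[i][d] * error_precision[d] * basis[j][d]
                for d in range(dim))
            + (weight_precision[i] if i == j else 0.0)
            for j in range(n_bases)]
           for i in range(n_bases)]
    rhs = [sum(basis[i][d] * error_precision[d] * x[d] for d in range(dim))
           for i in range(n_bases)]
    return solve_linear(lhs, rhs)


# Two orthogonal bases; the second has a very large weight precision,
# so its weight is pulled toward zero (the basis is treated as irrelevant).
weights = map_sensing_weights(
    basis=[[1.0, 0.0], [0.0, 1.0]],
    x=[2.0, 3.0],
    weight_precision=[1e-9, 1e9],
    error_precision=[1.0, 1.0],
)
```

In this toy setting, a large precision on a weight shrinks that weight toward zero, which is one way to read the abstract's statement that the precision parameters determine the relevance of each basis to a feature vector.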

Published in: IEEE Transactions on Audio, Speech, and Language Processing (Volume 20, Issue 1)