
Speech feature analysis using variational Bayesian PCA


3 Author(s)
Oh-Wook Kwon (Inst. for Neural Comput., Univ. of California, La Jolla, CA, USA); Kwokleung Chan; Te-Won Lee

In most hidden Markov model-based automatic speech recognition systems, two fundamental questions are the intrinsic dimensionality of the speech features and the number of clusters to use in the Gaussian mixture model. We analyzed mel-frequency band energies using a variational Bayesian principal component analysis method, which estimates both the feature dimensionality and the number of Gaussian mixtures by maximizing a lower bound on the evidence rather than maximizing the likelihood function, as conventional speech recognition systems do. Applied to the Texas Instruments/Massachusetts Institute of Technology (TIMIT) speech database, the method revealed the intrinsic structures of vowels and consonants. Its usefulness is demonstrated by superior classification performance on the most difficult phonemes, /b/, /d/, and /g/.
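
For readers unfamiliar with the phrase "lower bound of the evidence", the generic variational Bayesian bound can be sketched as follows. This is the standard textbook form, not necessarily the exact objective used in the paper. The symbols are assumptions introduced here for illustration: $X$ is the observed data, $Z$ collects all latent variables and parameters, $m$ is a candidate model structure (for example, a choice of dimensionality and number of mixtures), and $q(Z)$ is the approximating distribution.

```latex
% Generic variational lower bound on the log evidence (illustrative notation).
\begin{align}
  \ln p(X \mid m)
    &= \mathcal{L}(q) + \mathrm{KL}\bigl(q(Z) \,\|\, p(Z \mid X, m)\bigr)
     \;\ge\; \mathcal{L}(q), \\
  \mathcal{L}(q)
    &= \int q(Z) \, \ln \frac{p(X, Z \mid m)}{q(Z)} \, dZ .
\end{align}
```

Because the KL divergence is non-negative, maximizing $\mathcal{L}(q)$ over $q$ tightens the bound on the log evidence. Unlike the maximized likelihood, the evidence integrates over the parameters, so it does not automatically increase as dimensions or mixture components are added.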

Published in:

IEEE Signal Processing Letters (Volume: 10, Issue: 5)