Data sampling ensemble acoustic modelling in speaker independent speech recognition

2 Author(s)
Xin Chen; Department of Computer Science, University of Missouri, Columbia, 65211 USA; Yunxin Zhao

In this paper, we extend our recent data-sampling-based ensemble acoustic modeling technique to the speaker-independent TIMIT task and propose new methods to further improve the effectiveness of the ensemble acoustic models. We propose applying overlapped speaker clustering in data sampling to construct an ensemble of acoustic models for speaker-independent speech recognition. In addition, we evaluate data sampling in recurrent neural networks for constructing an RNN-based frame classifier. We also investigate using CVEM in place of EM in our ensemble acoustic model training. By applying these methods to the speaker-independent TIMIT phone recognition task, we obtained a 2.5% absolute gain in phone accuracy over a standard HMM baseline system.
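
The abstract does not spell out how overlapped speaker clustering is carried out, so the following is only a minimal sketch of one plausible reading: each speaker is assigned to its few nearest cluster centroids, so the resulting training subsets overlap, and each subset could then train one member of the ensemble. The function name `overlapped_clusters`, the `overlap` parameter, the 2-D speaker vectors, and the fixed centroids are all hypothetical and are not taken from the paper.

```python
import math


def overlapped_clusters(speaker_vectors, centroids, overlap=2):
    """Assign each speaker to its `overlap` nearest centroids.

    Hypothetical illustration only: the paper's actual clustering
    criterion and speaker representation are not given in the abstract.
    """
    clusters = {i: [] for i in range(len(centroids))}
    for speaker, vec in speaker_vectors.items():
        # Rank centroids by Euclidean distance to this speaker's vector.
        ranked = sorted(
            range(len(centroids)),
            key=lambda i: math.dist(vec, centroids[i]),
        )
        # A speaker joins several clusters, which makes the subsets overlap.
        for i in ranked[:overlap]:
            clusters[i].append(speaker)
    return clusters


# Toy data (assumed): four speakers on a line, three fixed centroids.
speakers = {
    "spk_a": (0.0, 0.0),
    "spk_b": (1.0, 0.0),
    "spk_c": (4.0, 0.0),
    "spk_d": (5.0, 0.0),
}
centroids = [(0.0, 0.0), (2.5, 0.0), (5.0, 0.0)]

subsets = overlapped_clusters(speakers, centroids, overlap=2)
for cluster_id, members in subsets.items():
    print(cluster_id, members)
```

In this toy setup every speaker lands in two subsets, so the middle cluster shares speakers with both of its neighbours. In an ensemble setting, one acoustic model would be trained per subset and their outputs combined at recognition time.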

Published in:

2010 IEEE International Conference on Acoustics, Speech and Signal Processing

Date of Conference:

14-19 March 2010