Multiple time resolution analysis of speech signal using MCE training with application to speech recognition | IEEE Conference Publication | IEEE Xplore

Abstract:

In this paper, we propose two methods of multiple time-resolution analysis of speech and apply them to automatic speech recognition (ASR). First, a constant frame-rate multi-scale analysis is proposed, based on a box of multi-scale features. Second, a variable-rate analysis is proposed in which the optimal temporal resolution is selected on the fly by a properly trained non-linear classifier unit. The classifier's parameters are trained discriminatively using minimum classification error (MCE) training. We use the recently proposed conditional random field (CRF) phonetic recognition system, which effectively combines highly correlated features. Results are reported on a frame-wise classification task and on the TIMIT phone recognition task. They show that (i) CRFs can effectively combine multi-scale features and (ii) MCE-trained variable-rate CRFs are competitive with the "box" combination method.
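
The abstract does not spell out the MCE objective, so the following is only a minimal sketch of the commonly used textbook form of the smoothed MCE loss for a single sample, not the authors' implementation. The function name `mce_loss` and the parameters `eta` and `gamma` are assumptions introduced for illustration; in the setting described above, the classes would presumably correspond to candidate temporal resolutions.

```python
import math

def mce_loss(scores, correct, eta=1.0, gamma=1.0):
    """Smoothed minimum classification error loss for one sample.

    scores  : list of discriminant values g_j(x), one per class
    correct : index of the true class
    eta     : controls how strongly the best competitor dominates (assumed name)
    gamma   : slope of the sigmoid that smooths the 0/1 error (assumed name)
    """
    competitors = [g for j, g in enumerate(scores) if j != correct]
    # Soft maximum over competing classes (log-mean-exp, shifted for stability).
    m = max(competitors)
    soft_max = m + math.log(
        sum(math.exp(eta * (g - m)) for g in competitors) / len(competitors)
    ) / eta
    # Misclassification measure: positive when competitors beat the true class.
    d = soft_max - scores[correct]
    # Sigmoid maps the measure to a differentiable error count in (0, 1).
    return 1.0 / (1.0 + math.exp(-gamma * d))
```

Because the loss is a smooth function of the discriminant scores, it can be minimized by gradient descent on the classifier's parameters, which is what makes MCE usable as a discriminative training criterion.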
Date of Conference: 19-24 April 2009
Date Added to IEEE Xplore: 26 May 2009
Conference Location: Taipei, Taiwan
