In this letter, we investigate the interaction between front-end feature extraction and back-end classification in speech recognition based on the nonstationary-state hidden Markov model (NSHMM). The proposed model seeks an optimal linear transformation of the mel-warped discrete Fourier transform (DFT) features according to the minimum classification error (MCE) criterion. This linear transformation is trained automatically, jointly with the NSHMM parameters, using gradient descent. An error rate reduction of 8% is obtained on a standard 39-class TIMIT phone classification task in comparison with the MCE-trained NSHMM using conventional preprocessing techniques.
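The joint training idea can be illustrated with a minimal sketch. It is not the letter's implementation: the NSHMM back end is replaced here by a simple nearest-prototype classifier so that the example stays short, and every name (`mce_loss_and_grads`, `train`, `W`, `mu`, `eta`, `gamma`) is hypothetical. What it keeps is the structure described above: a linear transformation `W` applied to the input features, a smoothed MCE loss, and gradient descent that updates the transformation and the classifier parameters together.

```python
import numpy as np

def mce_loss_and_grads(W, mu, x, label, eta=1.0, gamma=1.0):
    """Smoothed MCE loss for one sample, with gradients w.r.t. W and mu.

    Assumption: the discriminant for class k is g_k = -||W x - mu_k||^2,
    a stand-in for the NSHMM log-likelihood used in the letter.
    """
    z = W @ x                                  # transformed feature, shape (d,)
    diff = z - mu                              # shape (K, d)
    g = -np.sum(diff ** 2, axis=1)             # discriminants, shape (K,)

    # Soft maximum over the competing classes (all classes except the label).
    mask = np.ones(len(mu), dtype=bool)
    mask[label] = False
    a = eta * g[mask]
    m = a.max()
    e = np.exp(a - m)                          # shifted for numerical stability
    G = (m + np.log(e.mean())) / eta

    d = -g[label] + G                          # misclassification measure
    loss = 1.0 / (1.0 + np.exp(-gamma * d))    # sigmoid-smoothed error count

    # d(d)/d(g_k): -1 for the true class, softmax weights for competitors.
    coef = np.zeros(len(mu))
    coef[mask] = e / e.sum()
    coef[label] = -1.0

    dl_dd = gamma * loss * (1.0 - loss)
    grad_mu = dl_dd * coef[:, None] * 2.0 * diff   # since dg_k/dmu_k = 2 (z - mu_k)
    grad_z = -grad_mu.sum(axis=0)                  # since dg_k/dz = -2 (z - mu_k)
    grad_W = np.outer(grad_z, x)                   # chain rule through z = W x
    return loss, grad_W, grad_mu

def train(X, y, n_classes, out_dim, steps=20, lr=0.05, seed=0):
    """Jointly update the transformation W and the prototypes mu by SGD.

    Assumes every class in range(n_classes) has at least one sample in X.
    """
    rng = np.random.default_rng(seed)
    W = np.eye(out_dim, X.shape[1])            # start from simple truncation
    mu = np.stack([(X[y == k] @ W.T).mean(axis=0) for k in range(n_classes)])
    for _ in range(steps):
        for i in rng.permutation(len(X)):
            _, gW, gmu = mce_loss_and_grads(W, mu, X[i], y[i])
            W -= lr * gW
            mu -= lr * gmu
    return W, mu
```

Here `X` is assumed to be an array of feature vectors (one row per token) and `y` an integer label array; calling `train(X, y, n_classes, out_dim)` returns the learned transformation and prototypes. The point of the design is that the gradient of the classification loss flows through `z = W x`, so the front-end transformation is tuned for the classifier's errors rather than fixed in advance.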