A discriminative training algorithm for estimating continuous-density hidden Markov models (CDHMMs) for automatic speech recognition is considered. The algorithm is based on a criterion called margin-enhanced maximum mutual information (MEMMI), and it estimates the CDHMM parameters by maximizing the weighted sum of the maximum mutual information objective function and the large margin objective function. The MEMMI is motivated by the criterion used in classifiers such as the soft margin support vector machine, which optimizes the weighted sum of the empirical risk function and the margin-related generalization function. The algorithm is an iterative procedure: at each stage, it updates the parameters by placing different weights on the utterances according to their log likelihood margins, so that incorrectly classified (negative margin) utterances are emphasized more than correctly classified utterances. The MEMMI leads to a simple objective function that can be optimized easily by a gradient ascent algorithm while maintaining a probabilistic model. Experimental results on the TIDIGITS database show that the recognition accuracy of the MEMMI is higher than that of other discriminative training criteria, such as the approximated maximum mutual information (AMMI), the minimum classification error (MCE), and the soft large margin estimation (SLME).
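The idea of combining an MMI term with a margin term, and of weighting utterances by their log likelihood margins, can be illustrated with a toy sketch. This is not the formulation from the paper: the smooth-hinge margin term, the trade-off weight `lam`, the target margin `rho`, the equal-prior assumption, and all function names are assumptions made for illustration only. Each utterance is represented simply by a list of per-class log likelihood scores.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def margin(scores, label):
    """Log likelihood margin: correct-class score minus best competing score."""
    competitors = [s for k, s in enumerate(scores) if k != label]
    return scores[label] - max(competitors)


def mmi_term(scores, label):
    """Log posterior of the correct class (equal class priors assumed)."""
    return scores[label] - log_sum_exp(scores)


def soft_margin_term(d, rho):
    """Assumed smooth hinge: about 0 when d >> rho, about d - rho when d << rho."""
    x = rho - d
    # -softplus(x), written so that exp never receives a large positive argument
    return -(max(x, 0.0) + math.log1p(math.exp(-abs(x))))


def combined_objective(all_scores, labels, lam=0.5, rho=1.0):
    """Weighted sum of the MMI term and the assumed margin term over utterances."""
    total = 0.0
    for scores, label in zip(all_scores, labels):
        d = margin(scores, label)
        total += mmi_term(scores, label) + lam * soft_margin_term(d, rho)
    return total


def utterance_weight(d, rho=1.0):
    """Derivative of soft_margin_term with respect to d, i.e. sigmoid(rho - d)."""
    x = rho - d
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


if __name__ == "__main__":
    # Two toy utterances, both with true class 0; the second is misclassified.
    all_scores = [[-10.0, -12.0, -15.0], [-11.0, -9.0, -14.0]]
    labels = [0, 0]
    for scores, label in zip(all_scores, labels):
        d = margin(scores, label)
        print(f"margin={d:+.1f}  weight={utterance_weight(d):.3f}")
    print("objective:", combined_objective(all_scores, labels))
```

Under these assumptions, the gradient of the margin term with respect to an utterance's margin is `sigmoid(rho - d)`, so a gradient ascent step automatically gives negative-margin utterances a larger weight than utterances that are already classified correctly with a comfortable margin. In a real CDHMM system the scores would come from the acoustic models and the gradient would be propagated to the Gaussian mixture parameters, which this sketch does not attempt.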