Discriminative training is an essential part of building a state-of-the-art speech recognition system. The Extended Baum–Welch (EBW) algorithm is the most popular method for carrying out this demanding large-scale optimization task. This paper presents a novel analysis of the EBW algorithm which shows that EBW performs a specific kind of constrained optimization. The constraints reveal an interesting connection between the improvement of the discriminative criterion and the Kullback–Leibler divergence (KLD). Based on the analysis, a novel method for controlling the EBW algorithm is proposed. The analysis uses decomposed formulae for Gaussian mixture KLDs which correspond to the ones used in the Constrained Line Search (CLS) optimization algorithm. The CLS algorithm for discriminative training is therefore also briefly presented and its connections to EBW are studied. Large vocabulary speech recognition experiments are used to evaluate the proposed control method for EBW, which is shown to outperform the common heuristics in model robustness. A comparison of EBW with CLS also shows differences in robustness in favor of EBW. The constraints for Gaussian parameter optimization, as well as the special mixture weight estimation method used with EBW, are shown to be the key factors for good performance.