Skip to Main Content
In this paper, we present a general algorithmic framework based on WFSTs for implementing a variety of discriminative training methods, such as MMI, MCE, and MPE/MWE. In contrast to the ordinary word lattices, the transducer-based lattices are more amenable to representing and manipulating the underlying hypothesis space and have a finer granularity at the HMM-state level. The transducers are processed into a two-layer hierarchy: at a high level, it is analogous to a word lattice, and each word transition embodies an HMM-state subgraph for that word at a lower level. This hierarchy combined with the appropriate customization of the transducers leads to a flexible implementation for all of the training criteria being discussed. The effectiveness of the framework is verified on two speech recognition tasks: Resource Management, and AT&T SCANMail, an internal voicemail-to-text task.