In a previous paper, extensions of the 2-level stochastic speech understanding system were proposed. First, a 3-level system is obtained by introducing a stochastic concept value normalization module. Then a 2+1-level system is obtained as a degraded 3-level system in which the conceptual decoding and value normalization steps are decoupled, which greatly reduces the model complexity and improves its trainability. In this paper, a multi-level spoken language understanding system is presented. The stochastic module is, for the first time, based on dynamic Bayesian networks (DBNs). Factored language models with a generalized parallel backoff procedure are used as the edge implementation, providing efficiently smoothed conditional probability estimates. This framework allows great flexibility in probability representation, which facilitates the development of the stochastic levels of the system. The proposed approaches, 3-level and 2+1-level, are evaluated on the French MEDIA task (tourist information and hotel booking). The MEDIA 10k-utterance training corpus is segmentally annotated, allowing direct training of the various levels of the conceptual models. The best DBN-based system obtains performance comparable to that of the best system in the MEDIA'05 evaluation campaign (H. Bonneau-Maynard et al., 2005).
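To make the idea of a backoff-smoothed conditional probability concrete, the following is a minimal sketch in the spirit of generalized parallel backoff, not the implementation used in the paper. It assumes a child variable (for example a concept tag) with two parents (for example the current word and the previous concept). The function names, the absolute discount of 0.5, and the choice of averaging the one-parent estimates with an add-one unigram are all illustrative assumptions.

```python
from collections import Counter


def train(samples):
    """Collect counts from (child, parent_a, parent_b) tuples."""
    counts = {name: Counter() for name in ("cab", "ab", "ca", "a", "cb", "b", "c")}
    for c, a, b in samples:
        counts["cab"][(c, a, b)] += 1
        counts["ab"][(a, b)] += 1
        counts["ca"][(c, a)] += 1
        counts["a"][a] += 1
        counts["cb"][(c, b)] += 1
        counts["b"][b] += 1
        counts["c"][c] += 1
    return counts


def _ratio(num, den):
    """Relative frequency, defined as 0 when the context was never observed."""
    return num / den if den else 0.0


def backoff_score(c, a, b, counts, vocab):
    """Assumed combination function: mean of the two one-parent estimates
    (each obtained by dropping one parent) and an add-one unigram estimate."""
    total = sum(counts["c"].values())
    p_a = _ratio(counts["ca"][(c, a)], counts["a"][a])
    p_b = _ratio(counts["cb"][(c, b)], counts["b"][b])
    p_uni = (counts["c"][c] + 1) / (total + len(vocab))
    return (p_a + p_b + p_uni) / 3.0


def conditional_distribution(a, b, counts, vocab, discount=0.5):
    """Return a smoothed distribution P(child | a, b) over vocab.

    Children seen with the full context keep a discounted relative frequency;
    the freed probability mass is shared among unseen children in proportion
    to their backoff scores, so the result sums to one.
    """
    ctx = counts["ab"][(a, b)]
    seen = {c: (counts["cab"][(c, a, b)] - discount) / ctx
            for c in vocab if counts["cab"][(c, a, b)] > 0}
    unseen = {c: backoff_score(c, a, b, counts, vocab)
              for c in vocab if c not in seen}
    if not unseen:
        # Every child was observed in this context: just renormalize.
        z = sum(seen.values())
        return {c: p / z for c, p in seen.items()}
    left = 1.0 - sum(seen.values())
    z = sum(unseen.values())
    dist = dict(seen)
    dist.update({c: left * g / z for c, g in unseen.items()})
    return dist
```

Calling `train` on a list of annotated tuples and then `conditional_distribution` for a given pair of parent values returns a dictionary of probabilities. A context never seen in training still receives a usable estimate, because each one-parent estimate can contribute independently.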
Spoken Language Technology Workshop, 2006. IEEE
Date of Conference: 10-13 Dec. 2006