Current speech-based dialog systems face a practical problem: speech recognizers inevitably make errors. Even in multimodal dialog systems, which have multiple input channels, speech recognition errors remain a major problem because speech carries a large portion of the user's intention. In this paper, we propose a re-ranking method to improve the performance of speech recognition in a multimodal dialog system. To re-rank the n-best speech recognition hypotheses, we use multimodal understanding features that are orthogonal to the speech, in addition to the speech recognizer features. We apply our method to a smart home domain, and the results show that the multimodal understanding features are promising for overcoming many speech errors.
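The re-ranking idea can be illustrated with a minimal sketch. It assumes a linear scoring function over per-hypothesis features; the abstract does not state the actual model, so the linear form, the feature names (`asr_confidence`, `gesture_match`), the weights, and the example utterances are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    """One entry of the recognizer's n-best list with its feature values."""
    text: str
    features: Dict[str, float]


def score(hyp: Hypothesis, weights: Dict[str, float]) -> float:
    # Weighted sum of features; a feature missing from the hypothesis counts as 0.
    # A linear model is an assumption made for this sketch.
    return sum(w * hyp.features.get(name, 0.0) for name, w in weights.items())


def rerank(nbest: List[Hypothesis], weights: Dict[str, float]) -> List[Hypothesis]:
    # Return the hypotheses sorted from highest to lowest combined score.
    return sorted(nbest, key=lambda h: score(h, weights), reverse=True)


# Hypothetical smart home example: the user says "turn on the light"
# while pointing at a lamp. "gesture_match" is 1.0 when the hypothesis
# agrees with the (hypothetical) gesture interpretation, else 0.0.
nbest = [
    Hypothesis("turn on the right", {"asr_confidence": 0.62, "gesture_match": 0.0}),
    Hypothesis("turn on the light", {"asr_confidence": 0.58, "gesture_match": 1.0}),
    Hypothesis("turn off the light", {"asr_confidence": 0.31, "gesture_match": 1.0}),
]

# Illustrative weights; in practice these would be learned from data.
weights = {"asr_confidence": 1.0, "gesture_match": 0.5}

best = rerank(nbest, weights)[0]
print(best.text)
```

In this toy list the recognizer's top hypothesis is a misrecognition, and the non-speech feature moves the intended utterance to the top. This shows only the general mechanism of combining recognizer features with features from another modality.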
Published in:
Robot and Human interactive Communication, 2007. RO-MAN 2007. The 16th IEEE International Symposium on
Date of Conference: 26-29 Aug. 2007