Semantics synchronous understanding for robust spoken language applications

Author:
Kuansan Wang; Speech Technology Group, Microsoft Research, Redmond, WA, USA

In this paper, we describe our recent effort to combine speech recognition and understanding into a single-pass decoding process. The goal is to use semantic structure not only to better handle disfluencies and improve overall understanding accuracy, but also to shorten response time and achieve higher interactivity. Three related techniques are instrumental in our approach. First, we employ the unified language model (ULM) to incorporate the semantic schema into the recognition language model. Second, we extend the search process from word-synchronous to semantic object synchronous (SOS) decoding. Finally, we use sequential detection to defer, reject, or accept semantic hypotheses and execute the consequent dialog actions while the user's utterance is still ongoing. We incorporated these methods into SALT and HTML and conducted comparative user studies based on the MiPad scenarios. The experimental results show that the system copes gracefully with spontaneous speech, and that users prefer the highly interactive nature of such systems even though there are no significant differences in task completion rate or understanding accuracy. The interactive interface does, however, allow a more effective visual prompting strategy, which contributes to significantly fewer out-of-grammar utterances.
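
The defer, reject, or accept decision described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the abstract gives no formulas, so the sketch assumes a generic two-threshold rule applied to a stream of confidence scores for one semantic hypothesis. The names `Decision` and `sequential_decision`, the thresholds `accept_at` and `reject_at`, and the score values are all hypothetical.

```python
from enum import Enum
from typing import Iterable, Tuple


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    DEFER = "defer"


def sequential_decision(
    scores: Iterable[float],
    accept_at: float = 0.9,   # assumed threshold, not from the paper
    reject_at: float = 0.2,   # assumed threshold, not from the paper
) -> Tuple[Decision, int]:
    """Scan confidence scores as they arrive and commit at the first crossing.

    Returns the decision and the number of scores consumed. If neither
    threshold is crossed, the hypothesis stays deferred.
    """
    seen = 0
    for seen, score in enumerate(scores, start=1):
        if score >= accept_at:
            return Decision.ACCEPT, seen   # confident enough: act now
        if score <= reject_at:
            return Decision.REJECT, seen   # unlikely enough: drop it
    return Decision.DEFER, seen            # keep listening


if __name__ == "__main__":
    # Hypothetical scores for one semantic object while the user is speaking.
    print(sequential_decision([0.5, 0.7, 0.95, 0.3]))
```

The point of the sketch is that a decision can be committed before all evidence has arrived, which is what lets a dialog action run while the utterance is still in progress. In the example the scan stops at the third score and never looks at the fourth.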

Published in:

2003 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU '03)

Date of Conference:

30 Nov.-3 Dec. 2003