Skip to Main Content
In processing ill-formed spontaneous spoken utterance, many state-of-the-art robust parsers achieve robustness by allowing skipping of words and rule symbols. The parser's ability to skip words and rule symbols, however, results in a much bigger search space and greatly increases the parse ambiguity. Previous approaches resolved these issues through manually labeling the types of rule symbols, or by utilizing heuristic scores or statistical probabilities. However, these approaches have certain drawbacks. This paper proposes to exploit embedded machine learning techniques to help with pruning and disambiguation in robust parsers. An embedded machine learning system is integrated with the heuristic score and the strategy of basing the types of rule symbols upon their correspondence to the domain model. This integration can considerably relieve the reliance of robust parser development on linguistic expert handcrafting. Our experiments show that this integration offers stronger capability in ambiguity resolution, thereby enabling the robust parser to achieve better parsing accuracy.