Chinese Verb Subcategorization Acquisition from Noisy Data on Sentence Level


Abstract:

Subcategorization is the process of further classifying a syntactic category into subsets. To improve the recall of acquisition, we design an automatic approach that enriches the argument knowledge of subcategorization frames (SCFs) by means of active learning and employs a multi-class SVM model to classify argument types. The approach can thus output an accurate SCF for each input sentence, even on noisy data, while avoiding hand-written rules. It generates hypotheses directly, without a statistical filtering step after generation. Experimental results indicate that acquisition performance improves significantly, especially recall, which increased from 88.83 to 99.75 in the open test.
Date of Conference: 31 March 2009 - 02 April 2009
Date Added to IEEE Xplore: 24 July 2009
Print ISBN: 978-0-7695-3507-4
Conference Location: Los Angeles, CA, USA

1. Introduction

Research into the automatic acquisition of subcategorization frames (SCFs) from large-scale corpora has made great progress in recent years. As a result, many related knowledge bases have been established for different languages, including Chinese verb subcategorization lexicons, which contain SCF distributions and example sentences.
