The automatic error-detection system is built on a hybrid algorithm that combines n-gram models, dependency parsing, Hownet, and hand-written rules to detect different types of errors in Chinese text. First, four different n-gram models are used to analyze the separate strings in the segmented text and detect lexical errors, with experiments carried out on word and character frequency statistics. Next, dependency parsing and Hownet are introduced into automatic proofreading to help detect semantic errors. Dependency grammar parses the whole sentence and marks the dominating and dominated relations among its words; combined with Hownet, it can check for semantic errors efficiently. In addition, grammatical collocation rules are written to check for syntax errors and to cover the gaps left by the two methods above. The resulting automatic error-detection system achieves a precision of 69.66% and a recall of 84.16%.
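The n-gram step described above can be illustrated with a minimal sketch. It is not the system's actual implementation: it uses a single word-level bigram model rather than the four n-gram models mentioned, the function names `train_bigrams` and `flag_suspects` and the `min_count` threshold are hypothetical, and the three-sentence corpus is invented purely for illustration. The idea shown is only that an adjacent token pair never (or rarely) seen in training is a candidate lexical error.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count adjacent-token pairs over already-segmented sentences."""
    bigrams = Counter()
    for tokens in sentences:
        bigrams.update(zip(tokens, tokens[1:]))
    return bigrams


def flag_suspects(tokens, bigrams, min_count=1):
    """Return each index i whose pair (tokens[i], tokens[i+1]) was seen
    fewer than min_count times in training (a hypothetical threshold)."""
    return [
        i
        for i, pair in enumerate(zip(tokens, tokens[1:]))
        if bigrams[pair] < min_count
    ]


# Toy training corpus, invented for illustration (assumed pre-segmented).
corpus = [
    ["我", "喜欢", "苹果"],
    ["我", "喜欢", "音乐"],
    ["他", "喜欢", "苹果"],
]
model = train_bigrams(corpus)

# "平果" is a misspelling of "苹果", so the pair ("喜欢", "平果") is unseen.
suspects = flag_suspects(["我", "喜欢", "平果"], model)
```

Here `suspects` holds the start index of each unseen pair. A frequency check of this kind only covers lexical errors, which is why the abstract pairs it with dependency parsing, Hownet, and collocation rules.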