Skip to Main Content
This paper addresses an issue of incorporating topic knowledge to improve Chinese word sense disambiguation. The key is how to learn topic knowledge as features in the design of classifiers for disambiguating word senses. This paper presents two solutions to learn topic knowledge. In the first solution, a Chinese domain knowledge dictionary named NEUKD is used to generate domain feature set. However, due to the limited coverage of the NEUKD, a constrained clustering algorithm is adopted for dictionary expansion. The second method is to build topic feature set by utilizing the Latent Dirichlet Allocation (LDA) algorithm on a large scale unlabeled corpus. Experiments on the SENSEVAL-3 Chinese dataset demonstrated that integrating topic knowledge improve the performance of Chinese word sense disambiguation.