Boosting is an established, practical machine learning method that has been applied successfully to a number of classification problems, and word sense disambiguation systems based on boosting have achieved state-of-the-art performance. This paper explores the basic but unavoidable problem of rule selection in AdaBoost as applied to a word sense disambiguation system, and presents the relations among rule selection, the number of iterations, and system performance on sparse data. The results show that increasing the number of iterations for AdaBoost trained on a small, noise-free set of examples is neither helpful nor harmful. The algorithm is sensitive to the selection of weak rules in two respects: on the one hand, some rules make the training error converge more quickly while also generalizing better; on the other hand, conflicts may occur among weak rules built on different features, causing trouble for the whole system.
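To make the roles of weak rules and iteration number concrete, the following is a minimal sketch of discrete AdaBoost in which each weak rule tests a single binary feature. It is an illustration only, not the system described in this paper: the function names (`train_adaboost`, `predict_rule`, `predict`, `training_error`) and the toy data (three binary features with a majority-vote label, rather than word sense disambiguation examples) are assumptions introduced here.

```python
import itertools
import math


def predict_rule(x, j, polarity):
    """Weak rule: vote `polarity` if feature j is present, else the opposite."""
    return polarity if x[j] == 1 else -polarity


def train_adaboost(X, y, rounds):
    """Discrete AdaBoost over single-feature weak rules.

    X: list of binary feature vectors, y: labels in {-1, +1}.
    Returns a list of (alpha, feature_index, polarity) triples.
    """
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Rule selection: pick the rule with the lowest weighted error.
        best = None
        for j in range(len(X[0])):
            for polarity in (1, -1):
                err = sum(w for w, x, label in zip(weights, X, y)
                          if predict_rule(x, j, polarity) != label)
                if best is None or err < best[0]:
                    best = (err, j, polarity)
        err, j, polarity = best
        # Clamp the error so the logarithm below stays finite.
        err = min(max(err, 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, polarity))
        # Increase the weight of misclassified examples, then renormalize.
        weights = [w * math.exp(-alpha * label * predict_rule(x, j, polarity))
                   for w, x, label in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all selected rules."""
    score = sum(alpha * predict_rule(x, j, p) for alpha, j, p in ensemble)
    return 1 if score >= 0 else -1


def training_error(ensemble, X, y):
    """Fraction of training examples the ensemble misclassifies."""
    return sum(predict(ensemble, x) != label for x, label in zip(X, y)) / len(X)


# Hypothetical toy data: the label is +1 when at least two features are set.
X = [list(bits) for bits in itertools.product([0, 1], repeat=3)]
y = [1 if sum(x) >= 2 else -1 for x in X]

for rounds in (1, 2, 3):
    print(rounds, training_error(train_adaboost(X, y, rounds), X, y))
```

On this toy set no single rule is error-free, and the training error reaches zero only once a rule on each of the three features has been selected. How quickly that happens depends on which rules the selection step is allowed to consider, which is the kind of dependence the abstract refers to.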