Skip to Main Content
A new divergence-based approach to feature selection for naive Bayes text classification is proposed in this paper. In this approach, the discrimination power of each feature is directly used for ranking various features through a criterion named overall-divergence, which is based on the divergence measures evaluated between various class density function pairs. Compared with other state-of-the-art algorithms (e.g. IG and CHI), the proposed approach shows more discrimination power for classifying confusing classes, and achieves better or comparable performance on evaluation data sets.