Naïve Bayes text classification with positive features selected by statistical method

2 Author(s)
Meena, M.J.; Chandran, K.R. (Dept. of CSE, PSG Coll. of Technol., Coimbatore, India)

Text classification continues to be one of the most researched problems due to the ever-increasing volume of electronic documents and digital data. Naive Bayes is a simple and effective classifier for data mining tasks, but it does not yield fully satisfactory results on automatic text classification problems. In this paper, the performance of the naive Bayes classifier is analyzed when it is trained using only the positive features selected by CHIR, a chi-square-statistics-based method. Feature selection is the most important preprocessing step for improving the efficiency and accuracy of text classification algorithms, since it removes redundant and irrelevant terms from the training corpus. Experiments were conducted on randomly selected training sets, and the performance of the classifier with words as features was analyzed. The proposed method achieves higher classification accuracy than other naive methods on the 20 Newsgroups benchmark.
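The pipeline the abstract describes can be sketched in a few steps: score every (term, class) pair with the chi-square statistic, keep only the terms that are positively correlated with a class, then train a multinomial naive Bayes classifier on the reduced vocabulary. The sketch below is a minimal illustration under stated assumptions: the four-document toy corpus, the top-k cutoff, and the plain chi-square ranking (standing in for the paper's CHIR variant, whose exact weighting is not reproduced here) are all illustrative choices, not the authors' implementation.

```python
import math
from collections import Counter, defaultdict

# Toy corpus of (tokens, label) pairs; an illustrative stand-in for 20 Newsgroups.
docs = [
    (["ball", "goal", "team", "score"], "sport"),
    (["team", "win", "match", "goal"], "sport"),
    (["cpu", "memory", "disk", "code"], "tech"),
    (["code", "bug", "compile", "cpu"], "tech"),
]

def chi2_table(docs):
    """Chi-square score and sign of dependence for each (term, class) pair."""
    N = len(docs)
    classes = sorted({lab for _, lab in docs})
    vocab = sorted({t for toks, _ in docs for t in toks})
    table = {}
    for t in vocab:
        for c in classes:
            A = sum(1 for toks, lab in docs if t in toks and lab == c)
            B = sum(1 for toks, lab in docs if t in toks and lab != c)
            C = sum(1 for toks, lab in docs if t not in toks and lab == c)
            D = N - A - B - C
            denom = (A + C) * (B + D) * (A + B) * (C + D)
            score = 0.0 if denom == 0 else N * (A * D - C * B) ** 2 / denom
            # Positive dependence: the term occurs with the class more than chance.
            table[(t, c)] = (score, A * D > C * B)
    return table

def select_positive_features(docs, k=3):
    """Union of the top-k positively correlated terms per class."""
    table = chi2_table(docs)
    feats = set()
    for c in sorted({lab for _, lab in docs}):
        ranked = sorted(
            (t for (t, cc), (_, pos) in table.items() if cc == c and pos),
            key=lambda t: (-table[(t, c)][0], t))
        feats.update(ranked[:k])
    return feats

def train_nb(docs, vocab):
    """Multinomial naive Bayes with Laplace smoothing over the selected vocab."""
    class_tokens = defaultdict(list)
    for toks, lab in docs:
        class_tokens[lab].extend(t for t in toks if t in vocab)
    priors = Counter(lab for _, lab in docs)
    model = {}
    for c, toks in class_tokens.items():
        counts = Counter(toks)
        denom = len(toks) + len(vocab)
        model[c] = (math.log(priors[c] / len(docs)),
                    {t: math.log((counts[t] + 1) / denom) for t in vocab})
    return model

def classify(model, tokens):
    """Pick the class with the highest log-posterior; unseen terms are ignored."""
    best, best_lp = None, float("-inf")
    for c, (lp_prior, lp_terms) in model.items():
        lp = lp_prior + sum(lp_terms[t] for t in tokens if t in lp_terms)
        if lp > best_lp:
            best, best_lp = c, lp
    return best

feats = select_positive_features(docs, k=3)
model = train_nb(docs, feats)
print(classify(model, ["goal", "team"]))  # sport
print(classify(model, ["cpu", "code"]))   # tech
```

On a real corpus the same steps apply, with proper tokenization, stop-word removal, and a much larger per-class cutoff k; the positivity filter is what distinguishes this selection from plain chi-square ranking, which also retains terms whose absence indicates the class.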

Published in:

First International Conference on Advanced Computing (ICAC 2009)

Date of Conference:

13-15 Dec. 2009