An improved naive Bayesian classifier technique coupled with a novel input solution method [rainfall prediction]

Authors: J. N. K. Liu (Dept. of Comput., Hong Kong Polytech. Univ., Hung Hom, China); B. N. L. Li; T. S. Dillon

Data mining is the study of how to determine underlying patterns in data to help make optimal decisions on computers when the database involved is voluminous, hard to characterize accurately, and constantly changing. It deploys techniques based on machine learning alongside more conventional methods. These techniques can generate decision or prediction models based on actual historical data. Therefore, they represent true evidence-based decision support. Rainfall prediction is a good problem to solve by data mining techniques. This paper proposes an improved naive Bayes classifier (INBC) technique and explores the use of genetic algorithms (GAs) for the selection of a subset of input features in classification problems. It then compares the following algorithms on real meteorological data from Hong Kong: (1) genetic algorithms with average classification or general classification (GA-AC and GA-C), (2) C4.5 with pruning, and (3) INBC with relative frequency or initial probability density (INBC-RF and INBC-IPD). Two simple schemes are proposed to construct a suitable data set for improving their performance. Scheme I uses all the basic input parameters for rainfall prediction. Scheme II uses the optimal subset of input variables selected by a GA. The results show that, among the methods compared, INBC achieved about a 90% accuracy rate on the rain/no-rain classification problems. This method also attained reasonable performance on rainfall prediction with three-level depth and five-level depth, with accuracy around 65%-70%.
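To make the relative-frequency idea concrete, here is a minimal Python sketch of a plain naive Bayes classifier for a rain/no-rain decision. It is not the authors' INBC: the abstract does not describe their improvements, so this uses textbook relative-frequency estimates with add-one (Laplace) smoothing, which is an assumption. The function names, the `features` argument (standing in for a feature subset such as one a GA might select under Scheme II), and the toy records are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, features=None):
    """Count class and feature-value frequencies from categorical data.

    rows     -- list of tuples of categorical feature values
    labels   -- list of class labels, same length as rows
    features -- optional list of column indices to use (a feature subset)
    """
    if features is None:
        features = list(range(len(rows[0])))
    class_counts = Counter(labels)
    # value_counts[(feature, label)][value] = number of occurrences
    value_counts = defaultdict(Counter)
    # distinct values seen per feature, used by the smoothing term
    domains = defaultdict(set)
    for row, label in zip(rows, labels):
        for f in features:
            value_counts[(f, label)][row[f]] += 1
            domains[f].add(row[f])
    return {
        "features": features,
        "class_counts": class_counts,
        "value_counts": value_counts,
        "domains": domains,
        "n": len(labels),
    }


def predict(model, row):
    """Return the label with the highest log posterior score."""
    best_label, best_score = None, -math.inf
    for label, count in model["class_counts"].items():
        # log prior estimated by relative frequency
        score = math.log(count / model["n"])
        for f in model["features"]:
            seen = model["value_counts"][(f, label)][row[f]]
            k = len(model["domains"][f])
            # add-one smoothing keeps unseen values from zeroing the product
            score += math.log((seen + 1) / (count + k))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up toy records: (humidity, pressure, wind). Not real meteorological data.
rows = [
    ("high", "low", "calm"),
    ("high", "low", "windy"),
    ("high", "high", "calm"),
    ("low", "high", "calm"),
    ("low", "high", "windy"),
    ("low", "low", "calm"),
]
labels = ["rain", "rain", "rain", "dry", "dry", "dry"]

full_model = train_naive_bayes(rows, labels)                 # all inputs, in the spirit of Scheme I
subset_model = train_naive_bayes(rows, labels, features=[0])  # a chosen subset, in the spirit of Scheme II
print(predict(full_model, ("high", "low", "calm")))
print(predict(subset_model, ("low", "low", "calm")))
```

In a GA-based setup, each candidate feature subset would be passed as `features` and scored by held-out classification accuracy; that search loop is omitted here.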

Published in:

IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) (Volume: 31, Issue: 2)