Abstract:
Dealing with continuous-valued attributes is an important data mining problem that affects the accuracy, complexity, and understandability of mining algorithms. This paper presents a new approach for dealing with continuous attributes that improves the quality of discretization as a preprocessing step for decision tree and naïve Bayesian classifiers. The proposed approach focuses on supervised discretization; however, unsupervised discretization can be applied in the same way. It finds the possible cut points among the values of a continuous attribute that can separate the class distributions, and then selects the best cut point as an interval border using the information gain heuristic and a Bayesian classifier. The proposed approach has been tested by comparing it with other discretization methods on a number of benchmark problems from the UCI machine learning repository. The experimental results show that the proposed approach improves the quality of discretization of continuous attributes.
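The cut-point search described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`entropy`, `candidate_cut_points`, `best_cut_point`) are hypothetical, candidate cuts are assumed to lie at midpoints between adjacent distinct values whose class labels differ, and only the information gain heuristic is shown (the Bayesian classifier step is omitted).

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def candidate_cut_points(values, labels):
    """Midpoints between adjacent distinct values where the classes differ.

    Assumption for illustration: a boundary is a candidate unless both
    neighbouring values are pure and share the same single class.
    """
    classes_by_value = {}
    for v, y in zip(values, labels):
        classes_by_value.setdefault(v, set()).add(y)
    distinct = sorted(classes_by_value)
    cuts = []
    for lo, hi in zip(distinct, distinct[1:]):
        if len(classes_by_value[lo] | classes_by_value[hi]) > 1:
            cuts.append((lo + hi) / 2)
    return cuts


def best_cut_point(values, labels):
    """Return (cut, gain) for the candidate with the highest information gain."""
    n = len(labels)
    base = entropy(labels)
    best, best_gain = None, 0.0
    for cut in candidate_cut_points(values, labels):
        left = [y for v, y in zip(values, labels) if v <= cut]
        right = [y for v, y in zip(values, labels) if v > cut]
        gain = (base
                - (len(left) / n) * entropy(left)
                - (len(right) / n) * entropy(right))
        if gain > best_gain:
            best, best_gain = cut, gain
    return best, best_gain


# Toy data (made up for illustration, not from the paper's experiments).
cut, gain = best_cut_point([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"])
```

Applying `best_cut_point` recursively to the two resulting intervals would yield multiple interval borders; a stopping rule would be needed and is left out of this sketch.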
Date of Conference: 22-24 December 2011
Date Added to IEEE Xplore: 09 March 2012