
Discretization of continuous-valued attributes in decision tree generation



Abstract:

The decision tree is one of the most popular and widely used classification models in machine learning, and the discretization of continuous-valued attributes plays an important role in decision tree generation. In this paper, we improve Fayyad's discretization method, which uses the average class entropy of candidate partitions to select boundaries for discretization; our method further reduces the number of candidate boundaries. We also propose a generalized splitting criterion for cut point selection and prove that, under this criterion, the cut points always lie on boundaries. Along with the formal proof, we present empirical results showing that the decision trees generated using such criteria are similar on several datasets from the UCI Machine Learning Repository.
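The core idea the abstract refers to can be illustrated with a minimal sketch of Fayyad-style cut point selection: candidate cuts are restricted to boundary points (midpoints between adjacent sorted values whose class labels differ), and the cut minimizing the weighted average class entropy is chosen. The function names and the handling of tied values below are illustrative assumptions, not the paper's exact procedure.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a collection of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_boundary_cut(values, labels):
    """Return the cut point minimizing the weighted average class entropy,
    considering only boundary points (midpoints between adjacent sorted
    values with different classes), in the spirit of Fayyad's method."""
    pairs = sorted(zip(values, labels))
    best_cut, best_score = None, float("inf")
    for (v1, c1), (v2, c2) in zip(pairs, pairs[1:]):
        if v1 == v2 or c1 == c2:   # not a class boundary: skip (key pruning step)
            continue
        cut = (v1 + v2) / 2
        left = [c for v, c in pairs if v <= cut]
        right = [c for v, c in pairs if v > cut]
        score = (len(left) * entropy(left)
                 + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            best_cut, best_score = cut, score
    return best_cut
```

Restricting the search to boundary points is what makes the method efficient: non-boundary midpoints can never minimize the average class entropy, so they are pruned without evaluation.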
Date of Conference: 11-14 July 2010
Date Added to IEEE Xplore: 20 September 2010

Conference Location: Qingdao, China

1. Introduction

Decision trees have been widely used for classification, and many algorithms have been proposed for their construction. Classical decision tree algorithms include the Concept Learning System (CLS), ID3 [1], [2], C4.5 [3], and Classification and Regression Trees (CART) [4]. The purpose of building decision trees is to predict unknown samples correctly. The tree should reflect the distribution of the data and be as small as possible; it is generally recognized that smaller trees usually have stronger generalization abilities. Most decision trees are generated in a top-down manner, choosing attributes as the nodes of the tree under some selection criterion [5]. The growth of a decision tree is a process of dividing the training set recursively; the procedure terminates when all samples in each subset belong to the same class.
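The top-down growth procedure described above can be sketched as a short recursive function. The `choose_split` callback stands in for whatever selection criterion is used (e.g. an information-gain-based one); its name and signature are assumptions for illustration, not part of the paper.

```python
def grow_tree(samples, labels, choose_split):
    """Grow a binary decision tree top-down.

    samples: list of feature vectors; labels: corresponding class labels.
    choose_split: criterion callback returning (feature_index, threshold);
    a hypothetical placeholder for an attribute-selection criterion.
    """
    if len(set(labels)) == 1:          # all samples share one class: stop
        return {"leaf": labels[0]}
    feat, thr = choose_split(samples, labels)
    left = [i for i, s in enumerate(samples) if s[feat] <= thr]
    right = [i for i, s in enumerate(samples) if s[feat] > thr]
    return {
        "feature": feat,
        "threshold": thr,
        "left": grow_tree([samples[i] for i in left],
                          [labels[i] for i in left], choose_split),
        "right": grow_tree([samples[i] for i in right],
                           [labels[i] for i in right], choose_split),
    }
```

The recursion mirrors the description in the text: each call partitions the current subset of the training data, and a branch terminates as a leaf once its subset is pure.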
