By Topic

Decision tree learning on very large data sets

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Hall, L.O. ; Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL, USA ; Chawla, N. ; Bowyer, K.W.

Consider a labeled data set of 1 terabyte in size. A salient subset might depend upon the users interests. Clearly, browsing such a large data set to find interesting areas would be very time consuming. An intelligent agent which, for a given class of user, could provide hints on areas of the data that might interest the user would be very useful. Given large data sets having categories of salience for different user classes attached to the data in them, these labeled sets of data can be used to train a decision tree to label unseen data examples with a category of salience. The training set will be much larger than usual. This paper describes an approach to generating the rules for an agent from a large training set. A set of decision trees are built in parallel on tractable size training data sets which are a subset of the original data. Each learned decision tree will be reduced to a set of rules, conflicting rules resolved and the resultant rules merged into one set. Results from cross validation experiments on a data set suggest this approach may be effectively applied to large sets of data.

Published in:

Systems, Man, and Cybernetics, 1998. 1998 IEEE International Conference on  (Volume:3 )

Date of Conference:

14-14 Oct. 1998