By Topic

Parallelization of decision tree algorithm and its performance evaluation

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

4 Author(s)
Kubota, K. ; Real World Comput. Partnership, Kanagawa, Japan ; Nakase, A. ; Sakai, H. ; Oyanagi, S.

Data mining is a typical application of high performance computing in the business field. An efficient data mining system which can deal with huge amount of data is desired. This paper describes the parallel processing of decision tree which is a typical algorithm for classification of large database. A free software C4.5 is parallelized for SMP machine using thread library. Parallelism in generating a decision tree can be classified into intra-node parallelism and inter-node parallelism. Intra-node parallelism can be further classified into record parallelism, attribute parallelism, and their combination. We have implemented these four kinds of parallelizing methods, and evaluated their effects with four kinds of test data. The result shows that there is a relation between the characteristics of data and the parallelizing methods, and combination of multiple parallelizing methods is the most effective one.

Published in:

High Performance Computing in the Asia-Pacific Region, 2000. Proceedings. The Fourth International Conference/Exhibition on  (Volume:2 )

Date of Conference:

14-17 May 2000