Robustness analysis of diversified ensemble decision tree algorithms for Microarray data classification

5 Author(s)
Hong Hu ; Department of Mathematics and Computing, University of Southern Queensland, Toowoomba, QLD 4350 Australia ; Jiu-Yong Li ; Hua Wang ; Grant Daggard

Ensemble classification methods have shown promise for achieving higher accuracy in microarray data classification. Because noise values remain in all microarray data even after the preprocessing stage, robustness is another very important criterion, in addition to accuracy, for evaluating the reliability of microarray classification algorithms. In this paper, we conduct an experimental comparison of our newly developed MDMT with C4.5, BaggingC4.5, AdaBoostingC4.5, Random Forest and CS4 on four microarray cancer data sets. We test and evaluate how well a given single or ensemble classifier tolerates noisy data in unseen test data sets, particularly with increasing levels of noise. The experimental results show that MDMT tolerates noise values in unseen test data sets better than the other compared methods do, particularly as the level of noise increases. We observe that Random Forest is comparable to MDMT in terms of resistance to noise. The experimental results also show that ensemble decision tree methods tolerate noise values better than the single-tree C4.5 does. We conclude that avoiding overlapping genes among the ensemble trees is an intuitive, simple and effective way to achieve a higher degree of diversity for ensemble decision tree methods. An algorithm based on this principle deals more reliably with microarray data sets that contain a certain level of noise.
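
The evaluation idea in the abstract (train on clean data, then measure accuracy on unseen test data with increasing levels of injected noise, using ensemble members that share no genes) can be illustrated with a minimal sketch. This is not the MDMT algorithm, whose details the abstract does not give. The synthetic data, the Gaussian noise model, the one-feature threshold "stumps" standing in for decision trees, and every function name below are assumptions made purely for illustration.

```python
import random
import statistics


def make_data(n, n_features, rng):
    """Synthetic two-class data (assumption): class 1 is shifted by +2 on every feature."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        rows.append([rng.gauss(2.0 * y, 0.5) for _ in range(n_features)])
        labels.append(y)
    return rows, labels


def add_noise(rows, level, rng, scale=1.0):
    """Copy rows, perturbing each value with probability `level` (assumed noise model)."""
    return [[v + rng.gauss(0.0, scale) if rng.random() < level else v for v in row]
            for row in rows]


def fit_stump(rows, labels, feature):
    """One-feature classifier: threshold halfway between the two class means."""
    m0 = statistics.mean(r[feature] for r, y in zip(rows, labels) if y == 0)
    m1 = statistics.mean(r[feature] for r, y in zip(rows, labels) if y == 1)
    return feature, (m0 + m1) / 2.0, 1 if m1 > m0 else 0


def predict_stump(stump, row):
    feature, threshold, high_label = stump
    return high_label if row[feature] > threshold else 1 - high_label


def fit_disjoint_ensemble(rows, labels, n_members):
    """Each member uses a different feature, so no feature is shared between members."""
    return [fit_stump(rows, labels, f) for f in range(n_members)]


def predict_ensemble(stumps, row):
    """Majority vote over the members (use an odd number of members to avoid ties)."""
    votes = sum(predict_stump(s, row) for s in stumps)
    return 1 if 2 * votes > len(stumps) else 0


def accuracy(predict, rows, labels):
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


rng = random.Random(0)
train_rows, train_labels = make_data(60, 5, rng)
test_rows, test_labels = make_data(60, 5, rng)

single = fit_stump(train_rows, train_labels, 0)
ensemble = fit_disjoint_ensemble(train_rows, train_labels, 5)

# Noise is injected into the unseen test set only; the models stay fixed.
for level in (0.0, 0.2, 0.4):
    noisy = add_noise(test_rows, level, rng)
    acc_single = accuracy(lambda r: predict_stump(single, r), noisy, test_labels)
    acc_ens = accuracy(lambda r: predict_ensemble(ensemble, r), noisy, test_labels)
    print(f"noise={level:.1f}  single={acc_single:.2f}  ensemble={acc_ens:.2f}")
```

Because each member reads a different feature, a corrupted value can mislead at most one vote. This gives one plausible reading of why non-overlapping genes might help an ensemble tolerate noise, though the sketch does not reproduce the paper's experiments or results.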

Published in:

2008 International Conference on Machine Learning and Cybernetics (Volume: 1)

Date of Conference:

12-15 July 2008