This paper presents a method for selecting informative features (genes) that combines K-Means clustering with signal-to-noise ratio (SNR) ranking. The method was evaluated on cancer classification problems, with Genetic Programming employed as the classifier. The experimental results indicate that the proposed method yields higher accuracy than SNR ranking alone, and higher accuracy than classification using all of the genes. The clustering step ensures that the selected genes have low redundancy, so the classifier can exploit these features to obtain better performance.
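To make the selection idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a two-class problem, the common SNR definition |mean0 - mean1| / (std0 + std1), and one selected gene per cluster (the top-SNR gene); the abstract does not state these details. The function names (`snr_score`, `kmeans`, `select_features`), the deterministic farthest-first initialization, and the toy data are all hypothetical choices made for illustration. The Genetic Programming classifier is not shown.

```python
from statistics import mean, pstdev


def snr_score(values, labels):
    """Absolute SNR of one gene across two classes labeled 0 and 1.

    Assumed definition: |mean0 - mean1| / (std0 + std1).
    """
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    denom = pstdev(a) + pstdev(b)
    return abs(mean(a) - mean(b)) / denom if denom > 0 else 0.0


def sq_dist(p, q):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def farthest_first(points, k):
    """Deterministic initialization (an illustrative choice, not from the paper)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iters=100):
    """Plain Lloyd's K-Means; returns a cluster index for each point."""
    centroids = farthest_first(points, k)
    assign = None
    for _ in range(iters):
        new_assign = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_assign == assign:  # assignments stable: converged
            break
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [mean(col) for col in zip(*members)]
    return assign


def select_features(genes, labels, k):
    """Cluster genes, then keep the highest-SNR gene from each cluster."""
    assign = kmeans(genes, k)
    best = {}  # cluster index -> (score, gene index)
    for idx, cluster in enumerate(assign):
        score = snr_score(genes[idx], labels)
        if cluster not in best or score > best[cluster][0]:
            best[cluster] = (score, idx)
    return sorted(idx for _, idx in best.values())


# Toy data: rows are genes, columns are samples (made up for illustration).
labels = [0, 0, 0, 1, 1, 1]
genes = [
    [1.0, 1.1, 0.9, 5.0, 5.1, 4.9],  # discriminative
    [1.0, 1.2, 0.8, 5.0, 5.2, 4.8],  # nearly redundant with the gene above
    [3.0, 3.1, 2.9, 3.0, 3.1, 2.9],  # uninformative
    [3.0, 3.3, 2.7, 3.2, 3.1, 2.9],  # weakly informative
]
selected = select_features(genes, labels, k=2)
print(selected)
```

In this toy case, ranking by SNR alone would pick the two nearly identical genes in the first two rows, whereas clustering first forces the two selected genes to come from different groups. That is the low-redundancy effect the abstract attributes to the clustering step.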
Published in:
Frontiers in the Convergence of Bioscience and Information Technologies, 2007. FBIT 2007
Date of Conference: 11-13 Oct. 2007