Skip to Main Content
Numerous recent studies have shown that microarray gene expression data is useful for cancer classification. Classification based on microarray data is very different from previous classification problems in that the number of features (genes) greatly exceeds the number of instances (tissue samples). It has been shown that selecting a small set of informative genes can lead to improved classification accuracy. It is thus important to first apply feature selection methods prior to classification. In the machine learning field, one of the most successful feature filtering algorithms is the Relief-F algorithm. In this work, we empirically evaluate its performance on three published cancer classification data sets. We use the linear SVM and the k-NN as classifiers in the experiments, and compare the performance of Relief-F with other feature filtering methods, including Information Gain, Gain Ratio, and χ2-statistic. Using the leave-one-out cross validation, experimental results show that the performance of Relief-F is comparable with other methods.
Date of Conference: 16-19 Aug. 2004