This paper compares two methodologically different approaches to gene set analysis, applied to feature selection for sample classification in microarray studies. We analyze competitive and self-contained methods in terms of the predictive performance of features generated from the most differentially expressed gene sets (pathways) identified by each approach. We also assess the stability of the returned features. We use the features to train several classifiers (e.g., SVM, random forest, and nearest shrunken centroids). We generally observe smaller classification errors and better feature stability with the self-contained algorithm. This comparative study is based on the leukemia data set published in .
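The distinction between the two approaches can be illustrated with a minimal sketch. It is not the procedure used in this study: the gene-level score (absolute difference in class means), the set-level statistic (mean gene score), the function names, and the toy data are all assumptions chosen for brevity. The sketch only shows the difference in null hypotheses. A self-contained test permutes sample labels and asks whether the genes in the set are associated with the phenotype at all. A competitive test resamples genes and asks whether the set is more associated with the phenotype than random gene sets of the same size.

```python
import random
from statistics import mean


def gene_score(values, labels):
    """Absolute difference in class means for one gene (an assumed stand-in statistic)."""
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    return abs(mean(class0) - mean(class1))


def set_score(expr, labels, gene_set):
    """Set-level statistic: mean gene score over the genes in the set."""
    return mean(gene_score(expr[g], labels) for g in gene_set)


def self_contained_p(expr, labels, gene_set, n_perm=200, seed=0):
    """Permute sample labels; only the genes inside the set are used."""
    rng = random.Random(seed)
    observed = set_score(expr, labels, gene_set)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # class sizes are preserved by shuffling
        if set_score(expr, shuffled, gene_set) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def competitive_p(expr, labels, gene_set, n_draws=200, seed=0):
    """Resample genes; the set is compared with random sets of equal size."""
    rng = random.Random(seed)
    scores = [gene_score(row, labels) for row in expr]
    observed = mean(scores[g] for g in gene_set)
    hits = 0
    for _ in range(n_draws):
        draw = rng.sample(range(len(expr)), len(gene_set))
        if mean(scores[g] for g in draw) >= observed:
            hits += 1
    return (hits + 1) / (n_draws + 1)


def make_toy_data(n_genes=40, n_per_class=6, n_shifted=5, shift=3.0, seed=1):
    """Synthetic expression matrix (genes x samples); the first n_shifted genes
    are shifted upward in class 1. Purely illustrative, not real microarray data."""
    rng = random.Random(seed)
    labels = [0] * n_per_class + [1] * n_per_class
    expr = []
    for g in range(n_genes):
        offset = shift if g < n_shifted else 0.0
        expr.append([rng.gauss(0.0, 1.0) + offset * y for y in labels])
    return expr, labels
```

Calling `make_toy_data()` and then passing the gene set `[0, 1, 2, 3, 4]` to both functions returns one permutation p-value per approach. On real data the two can disagree. For instance, when most genes on the array are differentially expressed, a set can be significant under the self-contained test yet unremarkable under the competitive one.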