Of the large number of genes measured in microarray data, only a small fraction are effective for performing a given diagnostic test, and identifying these genes for disease diagnosis is very difficult. To address this, a new supervised gene clustering algorithm is proposed to cluster genes from microarray data. The proposed method directly incorporates the information of response variables into the grouping process to find such groups of genes. Significant cluster representatives then form a reduced feature set that can be used to build classifiers with very high classification accuracy. The effectiveness of the proposed method is demonstrated on three microarray data sets, in comparison with existing methods, based on the predictive accuracy of the naive Bayes classifier, the K-nearest neighbor rule, and the support vector machine.
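To make the idea of label-guided grouping concrete, the following is a minimal sketch and not the algorithm proposed here, whose details the abstract does not give. It assumes that relevance to the response variable and similarity between genes are both measured by Pearson correlation, that a gene joins a cluster only if the averaged representative stays at least as relevant to the class labels, and that the cluster representative is the mean expression profile. The function names, the parameters `n_clusters` and `sim_threshold`, and the toy data are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def mean_profile(genes, members):
    """Average expression profile of the listed genes, sample by sample."""
    n_samples = len(genes[members[0]])
    return [sum(genes[g][s] for g in members) / len(members) for s in range(n_samples)]


def supervised_gene_clusters(genes, labels, n_clusters=2, sim_threshold=0.8):
    """Greedy, label-guided clustering (illustrative assumption, not the paper's method).

    Returns a list of (member_indices, representative_profile) pairs.
    """
    remaining = set(range(len(genes)))
    clusters = []
    while remaining and len(clusters) < n_clusters:
        # Seed with the remaining gene most correlated with the class labels.
        seed = max(sorted(remaining), key=lambda g: abs(pearson(genes[g], labels)))
        members = [seed]
        rep = list(genes[seed])
        # Visit the other genes from most to least similar to the seed.
        candidates = sorted(
            (g for g in remaining if g != seed),
            key=lambda g: -pearson(genes[g], genes[seed]),
        )
        for g in candidates:
            if pearson(genes[g], genes[seed]) < sim_threshold:
                break  # every later candidate is even less similar
            trial = mean_profile(genes, members + [g])
            # Supervised step: accept only if label relevance does not drop.
            if abs(pearson(trial, labels)) >= abs(pearson(rep, labels)):
                members.append(g)
                rep = trial
        clusters.append((sorted(members), rep))
        remaining -= set(members)
    return clusters


# Toy data (invented): 4 genes measured on 6 samples, two classes.
labels = [0, 0, 0, 1, 1, 1]
genes = [
    [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],  # up-regulated in class 1
    [1.1, 1.0, 1.0, 2.8, 3.2, 3.0],  # co-expressed with gene 0
    [2.0, 2.1, 1.9, 2.0, 2.1, 1.9],  # unrelated to the class
    [5.0, 4.8, 5.1, 4.0, 4.1, 3.9],  # down-regulated in class 1
]
clusters = supervised_gene_clusters(genes, labels)
for members, representative in clusters:
    print(members, [round(v, 2) for v in representative])
```

In this sketch, the representatives would serve as the reduced feature set passed to a classifier such as naive Bayes, K-nearest neighbor, or a support vector machine.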