Feature selection is the process of selecting highly informative features. Microarray data typically comprise few samples and a large number of genes. Feature selection for gene expression data aims to find a set of genes that best discriminates highly expressed genes from highly suppressed genes. Supervised feature selection methods select features using an evaluation function or metric that is tied to the decision classes. However, gene expression datasets are continuous, and for such datasets the decision class is not provided. Existing unsupervised feature selection methods are not effective at selecting features from real-valued data, and discretizing the original data results in information loss. This paper presents an efficient Unsupervised Tolerance Rough Set based Relative Reduct (U-TRS-RelRed) algorithm. The algorithm uses backward elimination to remove features from the complete set of original features. The K-Means and Rough K-Means algorithms are used to cluster the reduced data and measure its quality. The proposed approach is compared with existing unsupervised methods, and the results demonstrate the efficiency of the proposed algorithm.
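The backward elimination idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's U-TRS-RelRed implementation, whose details are not given here. It assumes a common tolerance relation (two samples are similar on a feature when their range-normalized difference is at most 1 - tau) and a relative-dependency style criterion: a feature is dropped when removing it leaves the number of distinct tolerance classes unchanged. The function names, the threshold `tau`, and the stopping rule are all hypothetical.

```python
import numpy as np

def tolerance_matrix(X, features, tau):
    """Boolean n x n matrix: True where two samples are similar on every listed feature.
    Assumed similarity: 1 - |a(x) - a(y)| / (max a - min a) >= tau."""
    n = X.shape[0]
    sim = np.ones((n, n), dtype=bool)
    for j in features:
        col = X[:, j]
        span = col.max() - col.min()
        if span == 0:
            continue  # constant feature: every pair of samples is similar on it
        diff = np.abs(col[:, None] - col[None, :]) / span
        sim &= (1.0 - diff) >= tau
    return sim

def count_tolerance_classes(X, features, tau):
    """Number of distinct tolerance classes (distinct rows of the similarity matrix)."""
    sim = tolerance_matrix(X, features, tau)
    return len({row.tobytes() for row in sim})

def backward_eliminate(X, tau=0.9):
    """Start from all features and drop each one whose removal leaves the
    count of tolerance classes unchanged (assumed relative-dependency test)."""
    selected = list(range(X.shape[1]))
    for j in range(X.shape[1]):
        if len(selected) == 1:
            break  # always keep at least one feature
        candidate = [f for f in selected if f != j]
        before = count_tolerance_classes(X, selected, tau)
        after = count_tolerance_classes(X, candidate, tau)
        if after == before:
            selected = candidate
    return selected
```

Under these assumptions, the reduced matrix `X[:, selected]` is what would then be passed to a clustering step such as K-Means to assess the quality of the selected features.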
Published in: 2012 International Conference on Advances in Engineering, Science and Management (ICAESM)
Date of Conference: 30-31 March 2012