Feature selection is an important step in the analysis of transcriptomes or gene expression data. It mitigates the curse of dimensionality and improves the interpretability of the analysis. Numerous feature selection methods have been proposed in the literature, and these methods rank genes in order of their relative importance. However, most of them determine the number of genes to use in an arbitrary or heuristic fashion. A theoretical way to determine the optimal number of genes to select for a given task is proposed. The proposed strategy has been applied to a number of gene expression datasets, and promising results have been obtained.