Identifying disease genes that might anticipate the clinical behavior of human cancers is important for understanding cancer pathogenesis. Computational analysis of disease genes from microarray data involves searching for a gene subset that can discriminate cancer samples from normal samples, a challenging task because the number of samples is small compared to the huge number of genes. In this paper, an algorithm (LRSVD) based on singular value decomposition and logistic regression is proposed to find genes that are associated with disease. LRSVD uses a threshold value to control the number of singular vectors, evaluates the contribution of each eigengene to classification accuracy through the regression coefficients of logistic regression, and then ranks each gene by its power to discriminate between the two kinds of samples. Results on colon gene expression data indicate that the LRSVD method, with a support vector machine (SVM) as the classifier, is an encouraging method for identifying disease genes.
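The three steps named above (thresholded SVD, logistic regression on eigengenes, gene ranking) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the threshold is read as a cumulative squared-singular-value fraction, the logistic regression is fitted by plain gradient descent with a small L2 penalty, and genes are scored by projecting the eigengene coefficients back to gene space. The function names `select_rank`, `fit_logistic`, and `rank_genes` are hypothetical, and the final SVM classification step is omitted.

```python
import numpy as np

def select_rank(singular_values, threshold=0.9):
    """Smallest k whose leading singular values reach `threshold` of the
    total squared energy (assumed reading of the threshold; use < 1)."""
    energy = np.cumsum(singular_values ** 2) / np.sum(singular_values ** 2)
    k = int(np.searchsorted(energy, threshold)) + 1
    return min(k, len(singular_values))

def fit_logistic(Z, y, lr=0.1, n_iter=2000, l2=1e-2):
    """Gradient-descent logistic regression; returns (weights, bias)."""
    n, k = Z.shape
    w, b = np.zeros(k), 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))   # predicted probabilities
        w -= lr * (Z.T @ (p - y) / n + l2 * w)    # penalized gradient step
        b -= lr * np.mean(p - y)
    return w, b

def rank_genes(X, y, threshold=0.9):
    """X: genes x samples expression matrix; y: 0/1 label per sample.
    Returns (gene indices sorted by decreasing score, scores)."""
    Xc = X - X.mean(axis=1, keepdims=True)        # center each gene
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = select_rank(s, threshold)
    # Rows of Vt are eigengenes (patterns across samples); rescale the
    # unit-norm rows so each feature has roughly unit variance.
    Z = Vt[:k].T * np.sqrt(X.shape[1])
    w, _ = fit_logistic(Z, y)
    # Assumed scoring rule: map eigengene coefficients back to gene space,
    # since each eigengene equals U[:, j].T @ Xc / s[j].
    scores = np.abs(U[:, :k] @ (w / s[:k]))
    return np.argsort(scores)[::-1], scores
```

Calling `rank_genes(X, y)` on a genes-by-samples matrix returns gene indices ordered from most to least discriminative under these assumptions. The top-ranked genes could then be passed to a classifier such as an SVM, as the abstract describes.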