Gene set analysis is well suited to finding significant gene categories whose genes are involved in related biological processes or share similar functions. Available tools for gene set analysis are poorly suited to microarray experiments with few replicates and tend to generate false positives for gene sets containing a large number of genes. We present a new method, SGS, for finding significant gene sets whose genes are differentially expressed. The methodology is based on the view that more strongly differentially expressed genes play more important roles in the gene expression profile. Therefore, a weighted distribution of gene expression is used to calculate the extent of up-regulation and down-regulation of a gene set. Two kinds of cutoffs are introduced to identify gene sets that are both biologically reasonable and statistically significant. Our method effectively reduces the false positive predictions caused by large gene set size. To suit microarray data with various experimental designs, including few replicates or multiple conditions, SGS provides three models. Gene expression data from microarray experiments on type II diabetes were analyzed to test the performance of SGS. In a comparison with GSEA, one of the most widely used gene set analysis tools, SGS finds more gene sets related to oxidative phosphorylation and the ribosome, and excludes gene sets unrelated to these two categories. The assessment indicates that the new tool achieves higher accuracy and a lower false positive rate.
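The weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the SGS formula, so everything here is an assumption made for illustration: the function name `weighted_set_scores`, the use of log fold changes as input, and the choice of the absolute log fold change as each gene's weight. It shows only how weighting lets strongly changed genes dominate a gene set's up-regulation and down-regulation scores.

```python
def weighted_set_scores(log_fold_changes):
    """Return (up_score, down_score) for one gene set.

    Assumption for illustration only: each gene is weighted by the
    absolute value of its log fold change. This is not the published
    SGS formula, which the abstract does not specify.
    """
    total_weight = sum(abs(x) for x in log_fold_changes)
    if total_weight == 0:
        # No gene in the set changes at all.
        return 0.0, 0.0
    # Weighted contribution of up-regulated genes (positive values).
    up = sum(x * abs(x) for x in log_fold_changes if x > 0) / total_weight
    # Weighted contribution of down-regulated genes (negative values),
    # reported as a non-negative magnitude.
    down = sum(-x * abs(x) for x in log_fold_changes if x < 0) / total_weight
    return up, down


# Hypothetical gene set with three genes.
up_score, down_score = weighted_set_scores([2.0, -1.0, 1.0])
print(up_score, down_score)
```

Under this assumed weighting, a single gene with a large change contributes more to the score than several genes with small changes, which matches the stated view that more differentially expressed genes matter more.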
Date of Conference: 10-12 May 2011