Whole-genome DNA microarrays make it possible to monitor the expression of nearly all the genes in an organism and have been widely used in scientific and industrial fields. The challenge no longer lies in obtaining the data, but in interpreting the results to reveal biologically significant mechanisms. A recently established method, gene set enrichment analysis (GSEA), assesses whether a priori defined gene sets show statistically significant, concordant differences between two biological states. This knowledge-based, module-level analysis has proved superior to traditional single-gene methods, as several later improvements based on the GSEA concept also demonstrate. However, GSEA was designed to work on a ranked list of genes, so knowledge-based analysis of other data types remains a challenge. In this study, we propose a framework for gene set analysis of three major data types: ranked genes, clustered genes, and signature genes. More interestingly, we further extend these methods to de novo motif discovery within almost the same framework. Analysis of real microarray data showed that biologically significant results could be recovered. The R scripts for Knowledge-based Integrative Analysis of Microarray data (KIAM) are available upon request from the authors.
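To make the ranked-list idea behind GSEA concrete, the following is a minimal sketch of an unweighted, Kolmogorov-Smirnov-like running-sum enrichment score. It is not the KIAM implementation, which is written in R and is not shown here. The function name `enrichment_score` and the gene labels are hypothetical, and the sketch assumes that each gene appears only once in the ranked list and that no significance testing (for example by permutation) is needed.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score for one gene set.

    ranked_genes: list of unique gene identifiers, most up-regulated first.
    gene_set: iterable of gene identifiers defined a priori.
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        # Step up on a set member, step down otherwise; the walk ends at 0.
        if gene in members:
            running += 1.0 / n_hit
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical toy ranking of six genes.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
top_score = enrichment_score(ranked, {"g1", "g2"})     # set at the top of the list
bottom_score = enrichment_score(ranked, {"g5", "g6"})  # set at the bottom of the list
```

A set concentrated at the top of the ranking gives a positive score and a set concentrated at the bottom gives a negative one. Clustered or signature gene lists have no ranking, so this statistic does not apply to them directly, which is the gap the framework addresses.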