High-throughput gene expression technologies are widely used in many biological fields. Typical analyses of gene expression data find groups of similarly expressed genes by clustering and identify differentially expressed genes by statistical testing. Such analyses, however, remain difficult to interpret in terms of molecular-level interactions or signal transduction based on prior biological information. Recently, a Gene Set Analysis (GSA) approach was developed by an MIT group, which paved the way for inferring the molecular pathway mechanisms behind genes that are differentially expressed among sample groups. Current GSA approaches do not take into consideration the hierarchical regulation among gene entries described by prior pathway information (e.g., KEGG pathways). We propose expanding GSA not only to reflect the hierarchical structures among genes but also to identify specific subpathways that statistically agree with the gene expression data and that could explain differences in molecular-level mechanisms between two sample groups. We obtained the KEGG pathways (http://www.genome.jp/kegg/pathway.html) and modeled their nodes and edges with a probabilistic model. Based on the model, a statistic was calculated for each subpathway in every KEGG pathway. We identified significant subpathways in an expression dataset.
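The idea of scoring each subpathway of a pathway graph can be illustrated with a minimal sketch. The abstract does not specify the probabilistic model or the statistic, so everything below is an assumption made for illustration: the toy graph `EDGES`, the per-gene z-scores `Z`, the restriction to linear source-to-sink paths in an acyclic graph, and the use of Stouffer's combined z-score in place of the authors' statistic.

```python
import math

# Hypothetical toy pathway (not a real KEGG pathway): each gene maps to the
# genes it regulates. The graph is assumed to be acyclic.
EDGES = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}

# Hypothetical per-gene z-scores from a two-group comparison.
Z = {"A": 2.0, "B": 2.0, "C": 0.0, "D": 2.0}


def linear_subpathways(edges):
    """Enumerate every source-to-sink path by depth-first search."""
    targets = {t for ts in edges.values() for t in ts}
    sources = [n for n in edges if n not in targets]
    paths = []

    def walk(node, path):
        path = path + [node]
        if not edges[node]:          # sink reached: record the full path
            paths.append(path)
            return
        for nxt in edges[node]:
            walk(nxt, path)

    for s in sources:
        walk(s, [])
    return paths


def stouffer(path, z):
    """Stouffer's combined z-score over the genes on one path.

    This is a stand-in statistic, not the one used in the paper.
    """
    return sum(z[g] for g in path) / math.sqrt(len(path))


def rank_subpathways(edges, z):
    """Return (score, path) pairs sorted from highest to lowest score."""
    scored = [(stouffer(p, z), p) for p in linear_subpathways(edges)]
    return sorted(scored, key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    for score, path in rank_subpathways(EDGES, Z):
        print(" -> ".join(path), round(score, 3))
```

In this toy example the path through B ranks above the path through C because C shows no expression change. A real analysis would also need a null distribution (for instance, by permuting sample labels) to judge significance, which this sketch omits.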
Date of Conference: 18-18 Dec. 2010