Identifying and analyzing tissue-specific (TS) genes and their regulatory activities is important for understanding the mechanisms of organisms, for disease diagnosis, and for drug design. In this paper, we designed a pipeline for discovering promoter motifs of tissue-specific genes. The pipeline consists of three phases: motif searching, motif merging, and motif validation. The motif searching phase integrated three algorithms: MEME, AlignACE, and Gibbs Sampling. For the second phase, we proposed a motif merging method, based on Bayesian probabilistic principles, to reduce redundancy among the motifs found in the first phase. Lastly, the motif validation phase verified the statistical significance of the discovered motifs using a Bayesian hypothesis test. We applied the pipeline to the promoter-region sequences (-449bp-1000bp) of 4,552 human tissue-specific genes across 82 tissues and of 924 housekeeping genes. The distribution of motifs across promoter regions shows that most motifs lie in the proximal regions (+500~50bp, -50bp~-500bp) of promoters.