Transcription factors play important roles in gene regulation. An accurate model that describes the binding sites of a transcription factor in the promoter region of a gene is therefore key to understanding the regulation of that gene. In this paper, we develop a new graph-theoretical approach that efficiently extracts features from the binding sites of a transcription factor. These features capture the dependencies among different positions in the binding site and can thus describe binding sites more accurately than models based on the conventional position-specific scoring matrix (PSSM), which treats each position independently. Based on these features, statistical models can be constructed to describe the binding sites of a transcription factor. Our test results show that models constructed with our approach identify important features of binding sites and achieve significantly improved accuracy in predicting the locations of binding sites in genomic DNA.