The identification of regulatory signals is one of the most challenging tasks in bioinformatics. Gene-profiling technologies now make it possible to obtain vast amounts of gene expression data for a particular organism under various conditions. This has created the opportunity to identify and analyze the parts of the genome believed to be responsible for transcription control: the transcription factor DNA-binding motifs (TFBMs). Developing a practical and efficient computational tool to identify TFBMs will enable us to better understand the interplay among thousands of genes in a complex eukaryotic organism. This problem, formulated mathematically in computer science as the motif finding problem, has been studied extensively in recent years. We develop a new mathematical model and approximation technique for motif searching. Based on the graph-theoretic and geometric properties of this approach, we propose a nonstatistical approximation algorithm to find motifs in a set of genome sequences.