Abstract:
Recognizing the regulatory motifs of co-regulated genes is essential for understanding their regulatory mechanisms. However, the automatic extraction of regulatory motifs from a given data set of the upstream noncoding DNA sequences of a family of co-regulated genes is difficult because regulatory motifs are often subtle and inexact. This problem is further complicated by the corruption of the data sets. In this paper, a new approach called mismatch-allowed probabilistic suffix tree motif extraction (MISAE) is proposed. It combines a mismatch-allowed probabilistic suffix tree, which is a probabilistic model, with local prediction to extract regulatory motifs. The proposed approach is tested on 15 co-regulated gene families and compares favorably with other state-of-the-art approaches. Moreover, MISAE performs well on "corrupted" data sets: it is able to extract the motif from a "corrupted" data set in which less than one fourth of the sequences contain the real motif.
Published in: Proceedings. 2004 IEEE Computational Systems Bioinformatics Conference, 2004. CSB 2004.
Date of Conference: 19-19 August 2004
Date Added to IEEE Xplore: 08 October 2004
Print ISBN: 0-7695-2194-0
PubMed ID: 16448011
Department of Computer Science and Engineering, University of Nebraska, Lincoln, NE, USA