LOGOS: a modular Bayesian model for de novo motif detection
Xing, E.P.
Wu, W.
Jordan, M.I.
Karp, R.M.
Div. of Comput. Sci., California Univ., Berkeley, CA, USA
This paper appears in: Bioinformatics Conference, 2003. CSB 2003. Proceedings of the 2003 IEEE
Publication Date: 11-14 Aug. 2003
On page(s): 266-276
ISBN: 0-7695-2000-6
INSPEC Accession Number: 7906335
Digital Object Identifier: 10.1109/CSB.2003.1227327
Current Version Published: 2003-09-08
Abstract
The complexity of the global organization and internal structures of motifs in higher eukaryotic organisms raises significant challenges for motif detection techniques. To achieve successful de novo motif detection, it is necessary to model the complex dependencies within and among motifs and to incorporate biological prior knowledge. In this paper, we present LOGOS, an integrated LOcal and GlObal motif Sequence model for biopolymer sequences, which provides a principled framework for developing, modularizing, extending and computing expressive motif models for complex biopolymer sequence analysis. LOGOS consists of two interacting submodels: HMDM, a local alignment model that captures biological prior knowledge and positional dependence within the local structure of a motif; and HMM, a global motif distribution model that captures the frequencies and dependencies of motif occurrences. Model parameters can be fit using training motifs within an empirical Bayesian framework. A variational EM algorithm is developed for de novo motif detection. LOGOS improves on existing models that ignore biological priors and dependencies in motif structures and motif occurrences, and demonstrates superior performance on both semirealistic test data and cis-regulatory sequences from yeast and Drosophila with regard to sensitivity, specificity, flexibility and extensibility.
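The local/global split described in the abstract can be illustrated with a deliberately simplified sketch. This is not the LOGOS model and is not taken from the paper: a symmetric Dirichlet pseudocount stands in for the HMDM prior, a small hand-built HMM with one background state and one state per motif position stands in for the global model, and plain Viterbi decoding stands in for variational EM. The names `smoothed_pwm`, `find_motif_starts`, `pseudocount` and `p_start` are hypothetical and chosen only for this illustration.

```python
import math

ALPHABET = "ACGT"

def smoothed_pwm(training_motifs, pseudocount=1.0):
    """Local part (assumed simplification): per-column base probabilities,
    smoothed with a symmetric Dirichlet pseudocount."""
    width = len(training_motifs[0])
    pwm = []
    for k in range(width):
        counts = {b: pseudocount for b in ALPHABET}
        for motif in training_motifs:
            counts[motif[k]] += 1
        total = sum(counts.values())
        pwm.append({b: counts[b] / total for b in ALPHABET})
    return pwm

def find_motif_starts(seq, pwm, p_start=0.05):
    """Global part (assumed simplification): Viterbi decoding in an HMM with
    state 0 = background and states 1..W = motif positions.
    Returns the 0-based indices where a motif is decoded to start."""
    W = len(pwm)
    neg_inf = float("-inf")

    def emit(state, base):
        # Uniform background; motif position k emits from PWM column k.
        return math.log(0.25) if state == 0 else math.log(pwm[state - 1][base])

    score = [neg_inf] * (W + 1)
    score[0] = math.log(1 - p_start) + emit(0, seq[0])
    score[1] = math.log(p_start) + emit(1, seq[0])
    backpointers = []
    for base in seq[1:]:
        new = [neg_inf] * (W + 1)
        ptr = [0] * (W + 1)
        # Background: stay in background, or leave the last motif position.
        stay = score[0] + math.log(1 - p_start)
        leave = score[W]  # transition probability 1, log(1) = 0
        new[0], ptr[0] = (stay, 0) if stay >= leave else (leave, W)
        new[0] += emit(0, base)
        # Motif position 1 is entered from background only.
        new[1] = score[0] + math.log(p_start) + emit(1, base)
        # Motif position k follows position k-1 with probability 1.
        for k in range(2, W + 1):
            new[k] = score[k - 1] + emit(k, base)
            ptr[k] = k - 1
        score = new
        backpointers.append(ptr)

    # Paths must end in background or on the last motif position.
    state = 0 if score[0] >= score[W] else W
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return [i for i, s in enumerate(path) if s == 1]

pwm = smoothed_pwm(["ACGT", "ACGT", "ACGA", "ACGT"])
print(find_motif_starts("TTTTTACGTTTTTT", pwm))  # prints [5]
```

In this toy setup the two parts are separable in the same spirit as the abstract's modular design: the PWM could be replaced by a richer local model, or the transition structure by a richer global one, without touching the other half.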