By Topic

Statistical Alignment with a Sequence Evolution Model Allowing Rate Heterogeneity along the Sequence

Sign In

Cookies must be enabled to login.After enabling cookies , please use refresh or reload or ctrl+f5 on the browser for the login options.

Formats Non-Member Member
$31 $13
Learn how you can qualify for the best price for this item!
Become an IEEE Member or Subscribe to
IEEE Xplore for exclusive pricing!
close button

puzzle piece

IEEE membership options for an individual and IEEE Xplore subscriptions for an organization offer the most affordable access to essential journal articles, conference papers, standards, eBooks, and eLearning courses.

Learn more about:

IEEE membership

IEEE Xplore subscriptions

3 Author(s)
Arribas-Gil, A. ; Dept. de Estadistica, Univ. Carlos III de Madrid, Getafe ; Metzler, D. ; Plouhinec, J.-L.

We present a stochastic sequence evolution model to obtain alignments and estimate mutation rates between two homologous sequences. The model allows two possible evolutionary behaviors along a DNA sequence in order to determine conserved regions and take its heterogeneity into account. In our model, the sequence is divided into slow and fast evolution regions. The boundaries between these sections are not known. It is our aim to detect them. The evolution model is based on a fragment insertion and deletion process working on fast regions only and on a substitution process working on fast and slow regions with different rates. This model induces a pair hidden Markov structure at the level of alignments, thus making efficient statistical alignment algorithms possible. We propose two complementary estimation methods, namely, a Gibbs sampler for Bayesian estimation and a stochastic version of the EM algorithm for maximum likelihood estimation. Both algorithms involve the sampling of alignments. We propose a partial alignment sampler, which is computationally less expensive than the typical whole alignment sampler. We show the convergence of the two estimation algorithms when used with this partial sampler. Our algorithms provide consistent estimates for the mutation rates and plausible alignments and sequence segmentations on both simulated and real data.

Published in:

Computational Biology and Bioinformatics, IEEE/ACM Transactions on  (Volume:6 ,  Issue: 2 )