We develop a sticky hidden Markov model (HMM) with a Dirichlet distribution (DD) prior, motivated by the problem of analyzing comparative genomic hybridization (CGH) data. As formulated, the sticky DD-HMM prior is employed to infer the number of states in an HMM while also imposing state persistence. The form of the proposed hierarchical model allows efficient variational Bayesian (VB) inference, which is of interest for large-scale CGH problems. We compare alternative formulations of the sticky HMM and examine the relative efficacy of VB and Markov chain Monte Carlo (MCMC) inference. To validate the formulation, example results are presented for an illustrative synthesized data set and for our main application, CGH, for which we consider breast cancer data. For the latter, we also make comparisons and partially validate the CGH analysis through factor analysis of associated (but distinct) gene-expression data.