Bayesian restoration of a hidden Markov chain with applications to DNA sequencing

Citation
G.A. Churchill and B. Lazareva, Bayesian restoration of a hidden Markov chain with applications to DNA sequencing, J COMPUT BI, 6(2), 1999, pp. 261-277
Number of citations
36
Language
English
Article type
Article
Subject categories
Biochemistry & Biophysics
Journal title
JOURNAL OF COMPUTATIONAL BIOLOGY
ISSN journal
1066-5277
Volume
6
Issue
2
Year of publication
1999
Pages
261 - 277
Database
ISI
SICI code
1066-5277(199922)6:2<261:BROAHM>2.0.ZU;2-K
Abstract
Hidden Markov models (HMMs) are a class of stochastic models that have proven to be powerful tools for the analysis of molecular sequence data. A hidden Markov model can be viewed as a black box that generates sequences of observations. The unobservable internal state of the box is stochastic and is determined by a finite state Markov chain. The observable output is stochastic with distribution determined by the state of the hidden Markov chain. We present a Bayesian solution to the problem of restoring the sequence of states visited by the hidden Markov chain from a given sequence of observed outputs. Our approach is based on a Monte Carlo Markov chain algorithm that allows us to draw samples from the full posterior distribution of the hidden Markov chain paths. The problem of estimating the probability of individual paths and the associated Monte Carlo error of these estimates is addressed. The method is illustrated by considering a problem of DNA sequence multiple alignment. The special structure for the hidden Markov model used in the sequence alignment problem is considered in detail. In conclusion, we discuss certain interesting aspects of biological sequence alignments that become accessible through the Bayesian approach to HMM restoration.
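Illustrative note (not part of the original record): the idea of drawing hidden-state paths from their posterior distribution can be sketched for the simplest case, in which the HMM parameters are assumed known. The Python sketch below uses standard forward filtering followed by backward sampling, which is a swapped-in textbook technique and may differ from the authors' algorithm. The function names `sample_hidden_path` and `estimate_path_probabilities`, the list-based parameter layout, and the toy two-state model are all hypothetical. Because the draws here are independent, the Monte Carlo error of an estimated path probability is taken to be the usual binomial standard error.

```python
import math
import random
from collections import Counter


def sample_hidden_path(obs, init, trans, emit, rng):
    """Draw one hidden-state path from P(path | obs) for a known HMM.

    Assumed layout (hypothetical): init[k] is the initial probability of
    state k, trans[j][k] the transition probability j -> k, emit[k][y] the
    probability that state k emits symbol y. obs must be non-empty.
    """
    n_states = len(init)

    # Forward filtering: alpha[t][k] = P(state_t = k | obs[0..t]).
    alpha = []
    for t, y in enumerate(obs):
        if t == 0:
            row = [init[k] * emit[k][y] for k in range(n_states)]
        else:
            prev = alpha[-1]
            row = [
                emit[k][y] * sum(prev[j] * trans[j][k] for j in range(n_states))
                for k in range(n_states)
            ]
        total = sum(row)
        alpha.append([v / total for v in row])

    # Backward sampling: draw the last state, then each earlier state
    # conditional on the state already drawn for the following position.
    states = range(n_states)
    path = [0] * len(obs)
    path[-1] = rng.choices(states, weights=alpha[-1])[0]
    for t in range(len(obs) - 2, -1, -1):
        nxt = path[t + 1]
        weights = [alpha[t][j] * trans[j][nxt] for j in states]
        path[t] = rng.choices(states, weights=weights)[0]
    return path


def estimate_path_probabilities(obs, init, trans, emit, n_samples=2000, seed=0):
    """Estimate posterior path probabilities from independent draws.

    Returns {path: (estimate, standard_error)}, where the standard error is
    the binomial one, sqrt(p * (1 - p) / n_samples).
    """
    rng = random.Random(seed)
    counts = Counter(
        tuple(sample_hidden_path(obs, init, trans, emit, rng))
        for _ in range(n_samples)
    )
    result = {}
    for path, count in counts.items():
        p = count / n_samples
        result[path] = (p, math.sqrt(p * (1.0 - p) / n_samples))
    return result


if __name__ == "__main__":
    # Toy two-state model with binary outputs (values are made up).
    init = [0.5, 0.5]
    trans = [[0.9, 0.1], [0.2, 0.8]]
    emit = [[0.8, 0.2], [0.3, 0.7]]
    estimates = estimate_path_probabilities([0, 0, 1, 1], init, trans, emit)
    for path, (p, se) in sorted(estimates.items(), key=lambda kv: -kv[1][0])[:3]:
        print(path, round(p, 3), "+/-", round(se, 3))
```

When the parameters are unknown, or the model has the special structure used for alignment, the path draws would typically be embedded in a larger sampler, and correlated draws would need a different error estimate than the binomial one used here.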