ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data
Discrete Markovian models can be used to characterize patterns in sequences of values and have many applications in biological sequence analysis, including gene prediction, CpG island detection, alignment, and protein profiling. We present ToPS, a computational framework that can be used to implemen...
Main authors: | Kashiwabara, André Yoshiaki; Bonadio, Ígor; Onuchic, Vitor; Amado, Felipe; Mathias, Rafael; Durham, Alan Mitchell |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2013 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3789777/ https://www.ncbi.nlm.nih.gov/pubmed/24098098 http://dx.doi.org/10.1371/journal.pcbi.1003234 |
_version_ | 1782286496858898432 |
---|---|
author | Kashiwabara, André Yoshiaki; Bonadio, Ígor; Onuchic, Vitor; Amado, Felipe; Mathias, Rafael; Durham, Alan Mitchell |
author_facet | Kashiwabara, André Yoshiaki; Bonadio, Ígor; Onuchic, Vitor; Amado, Felipe; Mathias, Rafael; Durham, Alan Mitchell |
author_sort | Kashiwabara, André Yoshiaki |
collection | PubMed |
description | Discrete Markovian models can be used to characterize patterns in sequences of values and have many applications in biological sequence analysis, including gene prediction, CpG island detection, alignment, and protein profiling. We present ToPS, a computational framework that can be used to implement different applications in bioinformatics analysis by combining eight kinds of models: (i) independent and identically distributed process; (ii) variable-length Markov chain; (iii) inhomogeneous Markov chain; (iv) hidden Markov model; (v) profile hidden Markov model; (vi) pair hidden Markov model; (vii) generalized hidden Markov model (GHMM); and (viii) similarity-based sequence weighting. The framework includes functionality for training, simulation, and decoding of the models. Additionally, it provides two methods to help with parameter setting: the Akaike and Bayesian information criteria (AIC and BIC). The models can be used stand-alone, combined in Bayesian classifiers, or included in more complex, multi-model probabilistic architectures using GHMMs. In particular, the framework provides a novel, flexible implementation of decoding in GHMMs that detects when the architecture can be traversed efficiently. |
format | Online Article Text |
id | pubmed-3789777 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3789777 2013-10-04 ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data Kashiwabara, André Yoshiaki; Bonadio, Ígor; Onuchic, Vitor; Amado, Felipe; Mathias, Rafael; Durham, Alan Mitchell PLoS Comput Biol Research Article Discrete Markovian models can be used to characterize patterns in sequences of values and have many applications in biological sequence analysis, including gene prediction, CpG island detection, alignment, and protein profiling. We present ToPS, a computational framework that can be used to implement different applications in bioinformatics analysis by combining eight kinds of models: (i) independent and identically distributed process; (ii) variable-length Markov chain; (iii) inhomogeneous Markov chain; (iv) hidden Markov model; (v) profile hidden Markov model; (vi) pair hidden Markov model; (vii) generalized hidden Markov model (GHMM); and (viii) similarity-based sequence weighting. The framework includes functionality for training, simulation, and decoding of the models. Additionally, it provides two methods to help with parameter setting: the Akaike and Bayesian information criteria (AIC and BIC). The models can be used stand-alone, combined in Bayesian classifiers, or included in more complex, multi-model probabilistic architectures using GHMMs. In particular, the framework provides a novel, flexible implementation of decoding in GHMMs that detects when the architecture can be traversed efficiently. Public Library of Science 2013-10-03 /pmc/articles/PMC3789777/ /pubmed/24098098 http://dx.doi.org/10.1371/journal.pcbi.1003234 Text en © 2013 Kashiwabara et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article; Kashiwabara, André Yoshiaki; Bonadio, Ígor; Onuchic, Vitor; Amado, Felipe; Mathias, Rafael; Durham, Alan Mitchell; ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data |
title | ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data |
title_full | ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data |
title_fullStr | ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data |
title_full_unstemmed | ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data |
title_short | ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data |
title_sort | tops: a framework to manipulate probabilistic models of sequence data |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3789777/ https://www.ncbi.nlm.nih.gov/pubmed/24098098 http://dx.doi.org/10.1371/journal.pcbi.1003234 |