ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data

Discrete Markovian models can be used to characterize patterns in sequences of values and have many applications in biological sequence analysis, including gene prediction, CpG island detection, alignment, and protein profiling. We present ToPS, a computational framework that can be used to implemen...

Bibliographic Details
Main authors: Kashiwabara, André Yoshiaki; Bonadio, Ígor; Onuchic, Vitor; Amado, Felipe; Mathias, Rafael; Durham, Alan Mitchell
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3789777/
https://www.ncbi.nlm.nih.gov/pubmed/24098098
http://dx.doi.org/10.1371/journal.pcbi.1003234
author Kashiwabara, André Yoshiaki
Bonadio, Ígor
Onuchic, Vitor
Amado, Felipe
Mathias, Rafael
Durham, Alan Mitchell
collection PubMed
description Discrete Markovian models can be used to characterize patterns in sequences of values and have many applications in biological sequence analysis, including gene prediction, CpG island detection, alignment, and protein profiling. We present ToPS, a computational framework that can be used to implement different applications in bioinformatics analysis by combining eight kinds of models: (i) independent and identically distributed process; (ii) variable-length Markov chain; (iii) inhomogeneous Markov chain; (iv) hidden Markov model; (v) profile hidden Markov model; (vi) pair hidden Markov model; (vii) generalized hidden Markov model (GHMM); and (viii) similarity-based sequence weighting. The framework includes functionality for training, simulation, and decoding of the models. Additionally, it provides two methods to help with parameter setting: the Akaike and Bayesian information criteria (AIC and BIC). The models can be used stand-alone, combined in Bayesian classifiers, or included in more complex, multi-model probabilistic architectures using GHMMs. In particular, the framework provides a novel, flexible implementation of decoding in GHMMs that detects when the architecture can be traversed efficiently.
format Online
Article
Text
id pubmed-3789777
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3789777 2013-10-04
PLoS Comput Biol
Research Article
Public Library of Science 2013-10-03
/pmc/articles/PMC3789777/
/pubmed/24098098
http://dx.doi.org/10.1371/journal.pcbi.1003234
Text en
© 2013 Kashiwabara et al
http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title ToPS: A Framework to Manipulate Probabilistic Models of Sequence Data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3789777/
https://www.ncbi.nlm.nih.gov/pubmed/24098098
http://dx.doi.org/10.1371/journal.pcbi.1003234