Evolutionary models for insertions and deletions in a probabilistic modeling framework

BACKGROUND: Probabilistic models for sequence comparison (such as hidden Markov models and pair hidden Markov models for proteins and mRNAs, or their context-free grammar counterparts for structural RNAs) often assume a fixed degree of divergence. Ideally we would like these models to be conditional...

Bibliographic Details
Main author: Rivas, Elena
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1087829/
https://www.ncbi.nlm.nih.gov/pubmed/15780137
http://dx.doi.org/10.1186/1471-2105-6-63
_version_ 1782123824331882496
author Rivas, Elena
collection PubMed
description BACKGROUND: Probabilistic models for sequence comparison (such as hidden Markov models and pair hidden Markov models for proteins and mRNAs, or their context-free grammar counterparts for structural RNAs) often assume a fixed degree of divergence. Ideally we would like these models to be conditional on evolutionary divergence time. Probabilistic models of substitution events are well established, but there has not been a completely satisfactory theoretical framework for modeling insertion and deletion events. RESULTS: I have developed a method for extending standard Markov substitution models to include gap characters, and another method for the evolution of state transition probabilities in a probabilistic model. These methods use instantaneous rate matrices in a way that is more general than those used for substitution processes, and are sufficient to provide time-dependent models for standard linear and affine gap penalties, respectively. Given a probabilistic model, we can make all of its emission probabilities (including gap characters) and all its transition probabilities conditional on a chosen divergence time. To do this, we only need to know the parameters of the model at one particular divergence time instance, as well as the parameters of the model at the two extremes of zero and infinite divergence. I have implemented these methods in a new generation of the RNA genefinder QRNA (eQRNA). CONCLUSION: These methods can be applied to incorporate evolutionary models of insertions and deletions into any hidden Markov model or stochastic context-free grammar, in a pair or profile form, for sequence modeling.
format Text
id pubmed-1087829
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-10878292005-04-30 Evolutionary models for insertions and deletions in a probabilistic modeling framework Rivas, Elena BMC Bioinformatics Research Article BACKGROUND: Probabilistic models for sequence comparison (such as hidden Markov models and pair hidden Markov models for proteins and mRNAs, or their context-free grammar counterparts for structural RNAs) often assume a fixed degree of divergence. Ideally we would like these models to be conditional on evolutionary divergence time. Probabilistic models of substitution events are well established, but there has not been a completely satisfactory theoretical framework for modeling insertion and deletion events. RESULTS: I have developed a method for extending standard Markov substitution models to include gap characters, and another method for the evolution of state transition probabilities in a probabilistic model. These methods use instantaneous rate matrices in a way that is more general than those used for substitution processes, and are sufficient to provide time-dependent models for standard linear and affine gap penalties, respectively. Given a probabilistic model, we can make all of its emission probabilities (including gap characters) and all its transition probabilities conditional on a chosen divergence time. To do this, we only need to know the parameters of the model at one particular divergence time instance, as well as the parameters of the model at the two extremes of zero and infinite divergence. I have implemented these methods in a new generation of the RNA genefinder QRNA (eQRNA). CONCLUSION: These methods can be applied to incorporate evolutionary models of insertions and deletions into any hidden Markov model or stochastic context-free grammar, in a pair or profile form, for sequence modeling. BioMed Central 2005-03-21 /pmc/articles/PMC1087829/ /pubmed/15780137 http://dx.doi.org/10.1186/1471-2105-6-63 Text en Copyright © 2005 Rivas; licensee BioMed Central Ltd.
spellingShingle Research Article
Rivas, Elena
Evolutionary models for insertions and deletions in a probabilistic modeling framework
title Evolutionary models for insertions and deletions in a probabilistic modeling framework
title_sort evolutionary models for insertions and deletions in a probabilistic modeling framework
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1087829/
https://www.ncbi.nlm.nih.gov/pubmed/15780137
http://dx.doi.org/10.1186/1471-2105-6-63
work_keys_str_mv AT rivaselena evolutionarymodelsforinsertionsanddeletionsinaprobabilisticmodelingframework