
A Model of Indel Evolution by Finite-State, Continuous-Time Machines

We introduce a systematic method of approximating finite-time transition probabilities for continuous-time insertion-deletion models on sequences. The method uses automata theory to describe the action of an infinitesimal evolutionary generator on a probability distribution over alignments, where bo...


Bibliographic Details
Main Author: Holmes, Ian
Format: Online Article Text
Language: English
Published: Genetics Society of America 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7768254/
https://www.ncbi.nlm.nih.gov/pubmed/33020189
http://dx.doi.org/10.1534/genetics.120.303630
author Holmes, Ian
collection PubMed
description We introduce a systematic method of approximating finite-time transition probabilities for continuous-time insertion-deletion models on sequences. The method uses automata theory to describe the action of an infinitesimal evolutionary generator on a probability distribution over alignments, where both the generator and the alignment distribution can be represented by pair hidden Markov models (HMMs). In general, combining HMMs in this way induces a multiplication of their state spaces; to control this, we introduce a coarse-graining operation to keep the state space at a constant size. This leads naturally to ordinary differential equations for the evolution of the transition probabilities of the approximating pair HMM. The TKF91 model emerges as an exact solution to these equations for the special case of single-residue indels. For the more general case of multiple-residue indels, the equations can be solved by numerical integration. Using simulated data, we show that the resulting distribution over alignments, when compared to previous approximations, is a better fit over a broader range of parameters. We also propose a related approach to develop differential equations for sufficient statistics to estimate the underlying instantaneous indel rates by expectation maximization. Our code and data are available at https://github.com/ihh/trajectory-likelihood.
format Online
Article
Text
id pubmed-7768254
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Genetics Society of America
record_format MEDLINE/PubMed
spelling pubmed-7768254 2021-01-11 A Model of Indel Evolution by Finite-State, Continuous-Time Machines Holmes, Ian Genetics Investigations Genetics Society of America 2020-12 2020-10-05 /pmc/articles/PMC7768254/ /pubmed/33020189 http://dx.doi.org/10.1534/genetics.120.303630 Text en Copyright © 2020 Holmes by the Genetics Society of America Available freely online through the author-supported open access option.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Model of Indel Evolution by Finite-State, Continuous-Time Machines
topic Investigations