
State aggregation for fast likelihood computations in molecular evolution

MOTIVATION: Codon models are widely used to identify the signature of selection at the molecular level and to test for changes in selective pressure during the evolution of genes encoding proteins. The large size of the state space of the Markov processes used to model codon evolution makes it diffi...


Bibliographic Details
Main authors: Davydov, Iakov I; Robinson-Rechavi, Marc; Salamin, Nicolas
Format: Online Article Text
Language: English
Published: Oxford University Press 2017
Subjects: Original Papers
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5408795/
https://www.ncbi.nlm.nih.gov/pubmed/28172542
http://dx.doi.org/10.1093/bioinformatics/btw632
author Davydov, Iakov I
Robinson-Rechavi, Marc
Salamin, Nicolas
collection PubMed
description MOTIVATION: Codon models are widely used to identify the signature of selection at the molecular level and to test for changes in selective pressure during the evolution of genes encoding proteins. The large size of the state space of the Markov processes used to model codon evolution makes it difficult to use these models with large biological datasets. We propose here to use state aggregation to reduce the state space of codon models and, thus, improve the computational performance of likelihood estimation on these models. RESULTS: We show that this heuristic speeds up the computations of the M0 and branch-site models up to 6.8 times. We also show through simulations that state aggregation does not introduce a detectable bias. We analyzed a real dataset and show that aggregation provides highly correlated predictions compared to the full likelihood computations. Finally, state aggregation is a very general approach and can be applied to any continuous-time Markov process-based model with large state space, such as amino acid and coevolution models. We therefore discuss different ways to apply state aggregation to Markov models used in phylogenetics. AVAILABILITY AND IMPLEMENTATION: The heuristic is implemented in the godon package (https://bitbucket.org/Davydov/godon) and in a version of FastCodeML (https://gitlab.isb-sib.ch/phylo/fastcodeml). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
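As a rough illustration of the state aggregation idea named in the description, the sketch below lumps a chosen set of states of a continuous-time Markov rate matrix into one aggregate state. It is a toy four-state example, not the godon or FastCodeML implementation: the function names, the frequency-weighted averaging rule for rates leaving the aggregate, and the choice of which states to keep are all assumptions made for illustration.

```python
def reversible_rate_matrix(pi, s):
    """Build a toy reversible rate matrix with Q[i][j] = s[i][j] * pi[j] for i != j."""
    n = len(pi)
    Q = [[s[i][j] * pi[j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        Q[i][i] = -sum(Q[i])  # diagonal makes each row sum to zero
    return Q


def aggregate_states(Q, pi, keep):
    """Lump every state not listed in `keep` into one aggregate state (last index).

    Assumed rule (hypothetical, for illustration only):
      rate kept -> aggregate : sum of the rates into the lumped states
      rate aggregate -> kept : average over lumped states, weighted by pi
    """
    n = len(Q)
    keep = list(keep)
    kept = set(keep)
    lumped = [k for k in range(n) if k not in kept]
    if not lumped:
        return [row[:] for row in Q], list(pi)

    m = len(keep) + 1          # reduced state count
    agg = m - 1                # index of the aggregate state
    weight = sum(pi[k] for k in lumped)

    A = [[0.0] * m for _ in range(m)]
    for a, i in enumerate(keep):
        for b, j in enumerate(keep):
            if a != b:
                A[a][b] = Q[i][j]
        A[a][agg] = sum(Q[i][k] for k in lumped)
    for b, j in enumerate(keep):
        A[agg][b] = sum(pi[k] * Q[k][j] for k in lumped) / weight
    for a in range(m):
        A[a][a] = -sum(A[a][b] for b in range(m) if b != a)

    pi_agg = [pi[i] for i in keep] + [weight]
    return A, pi_agg


# Toy model: four states with made-up frequencies and symmetric exchangeabilities.
pi = [0.1, 0.2, 0.3, 0.4]
s = [[0, 1, 2, 1],
     [1, 0, 1, 2],
     [2, 1, 0, 1],
     [1, 2, 1, 0]]
Q = reversible_rate_matrix(pi, s)

# Keep states 0 and 2 (say, the ones observed in the data); lump states 1 and 3.
A, pi_agg = aggregate_states(Q, pi, keep=[0, 2])
```

Under this rule the reduced matrix still has rows that sum to zero and keeps the aggregated equilibrium frequencies stationary, so it remains a valid rate matrix on three states instead of four. The likelihood computed from it is an approximation of the full-model likelihood, which is why the description calls the approach a heuristic.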
format Online
Article
Text
id pubmed-5408795
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5408795 2017-05-03 State aggregation for fast likelihood computations in molecular evolution Davydov, Iakov I; Robinson-Rechavi, Marc; Salamin, Nicolas Bioinformatics Original Papers Oxford University Press 2017-02-01 2016-10-02 /pmc/articles/PMC5408795/ /pubmed/28172542 http://dx.doi.org/10.1093/bioinformatics/btw632 Text en © The Author 2016. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title State aggregation for fast likelihood computations in molecular evolution
topic Original Papers