
A Kolmogorov-Smirnov test for the molecular clock based on Bayesian ensembles of phylogenies

Divergence date estimates are central to understanding evolutionary processes and depend, in the case of molecular phylogenies, on tests of molecular clocks. Here we propose two non-parametric tests of strict and relaxed molecular clocks built upon a framework that uses the empirical cumulative distrib...


Bibliographic Details
Main authors: Antoneli, Fernando, Passos, Fernando M., Lopes, Luciano R., Briones, Marcelo R. S.
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5754089/
https://www.ncbi.nlm.nih.gov/pubmed/29300759
http://dx.doi.org/10.1371/journal.pone.0190826
author Antoneli, Fernando
Passos, Fernando M.
Lopes, Luciano R.
Briones, Marcelo R. S.
collection PubMed
description Divergence date estimates are central to understanding evolutionary processes and depend, in the case of molecular phylogenies, on tests of molecular clocks. Here we propose two non-parametric tests of strict and relaxed molecular clocks built upon a framework that uses the empirical cumulative distribution (ECD) of branch lengths obtained from an ensemble of Bayesian trees and the well-known non-parametric (one-sample and two-sample) Kolmogorov-Smirnov (KS) goodness-of-fit tests. In the strict clock case, the method consists of using the one-sample KS test to directly test whether the phylogeny is clock-like, in other words, whether it follows a Poisson law. The ECD is computed from the discretized branch lengths, and the parameter λ of the expected Poisson distribution is calculated as the average branch length over the ensemble of trees. To compensate for auto-correlation in the ensemble of trees and for pseudo-replication, we take advantage of thinning and effective sample size, two features provided by Bayesian inference MCMC samplers. Finally, it is observed that tree topologies with very long or very short branches lead to Poisson mixtures; in this case we propose the use of the two-sample KS test with samples from two continuous branch length distributions, one obtained from an ensemble of clock-constrained trees and the other from an ensemble of unconstrained trees. In this second form the test can also be applied to relaxed clock models. Using a statistically equivalent ensemble of phylogenies to obtain the branch length ECD, instead of a single consensus tree, considerably reduces the effects of small sample size and provides a gain of power.
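The two tests summarized in the description can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' implementation. The function names are hypothetical. The discretization rule (branch length times an assumed alignment length `n_sites`, rounded) and the asymptotic Kolmogorov p-value are assumptions that the record does not specify. Because λ is estimated from the same data and the Poisson law is discrete, the one-sample p-value should be read as approximate.

```python
import math
from collections import Counter


def discretize(branch_lengths, n_sites):
    """Assumed rule: expected substitutions per site -> integer counts."""
    return [round(b * n_sites) for b in branch_lengths]


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed term by term."""
    term = total = math.exp(-lam)
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return min(total, 1.0)


def ks_one_sample_poisson(counts):
    """KS distance between the ECD of counts and Poisson(mean of counts).

    Both functions are step functions that jump only at integers, so the
    supremum is attained at an integer between 0 and max(counts).
    """
    n = len(counts)
    lam = sum(counts) / n          # lambda = average discretized branch length
    freq = Counter(counts)
    seen, d = 0, 0.0
    for k in range(max(counts) + 1):
        seen += freq[k]
        d = max(d, abs(seen / n - poisson_cdf(k, lam)))
    return d, lam


def ks_two_sample(a, b):
    """KS distance between the ECDs of two samples (ties handled)."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def kolmogorov_pvalue(d, n_eff):
    """Asymptotic p-value; n_eff is the effective sample size.

    For the two-sample test, pass n_eff = n * m / (n + m).
    """
    t = math.sqrt(n_eff) * d
    if t < 0.2:                    # series is numerically 1 in this range
        return 1.0
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * t * t)
            for k in range(1, 101))
    return max(0.0, min(1.0, 2.0 * s))
```

Thinning an MCMC sample before pooling branch lengths can be as simple as `samples[::step]`. Passing the sampler's reported effective sample size as `n_eff`, rather than the raw number of pooled branch lengths, is one way to account for auto-correlation in the ensemble.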
format Online
Article
Text
id pubmed-5754089
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5754089 2018-01-26 A Kolmogorov-Smirnov test for the molecular clock based on Bayesian ensembles of phylogenies Antoneli, Fernando Passos, Fernando M. Lopes, Luciano R. Briones, Marcelo R. S. PLoS One Research Article
Public Library of Science 2018-01-04 /pmc/articles/PMC5754089/ /pubmed/29300759 http://dx.doi.org/10.1371/journal.pone.0190826 Text en © 2018 Antoneli et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A Kolmogorov-Smirnov test for the molecular clock based on Bayesian ensembles of phylogenies
topic Research Article