
Inferring demographic parameters in bacterial genomic data using Bayesian and hybrid phylogenetic methods

BACKGROUND: Recent developments in sequencing technologies make it possible to obtain genome sequences from a large number of isolates in a very short time. Bayesian phylogenetic approaches can take advantage of these data by simultaneously inferring the phylogenetic tree, evolutionary timescale, an...


Bibliographic Details
Main authors: Duchene, Sebastian, Duchene, David A., Geoghegan, Jemma L., Dyson, Zoe A., Hawkey, Jane, Holt, Kathryn E.
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6006949/
https://www.ncbi.nlm.nih.gov/pubmed/29914372
http://dx.doi.org/10.1186/s12862-018-1210-5
author Duchene, Sebastian
Duchene, David A.
Geoghegan, Jemma L.
Dyson, Zoe A.
Hawkey, Jane
Holt, Kathryn E.
collection PubMed
description BACKGROUND: Recent developments in sequencing technologies make it possible to obtain genome sequences from a large number of isolates in a very short time. Bayesian phylogenetic approaches can take advantage of these data by simultaneously inferring the phylogenetic tree, evolutionary timescale, and demographic parameters (such as population growth rates), while naturally integrating uncertainty in all parameters. Despite their desirable properties, Bayesian approaches can be computationally intensive, hindering their use for outbreak investigations involving genome data for a large number of pathogen isolates. An alternative to using full Bayesian inference is to use a hybrid approach, where the phylogenetic tree and evolutionary timescale are estimated first using maximum likelihood. Under this hybrid approach, demographic parameters are inferred from estimated trees instead of the sequence data, using maximum likelihood, Bayesian inference, or approximate Bayesian computation. This can vastly reduce the computational burden, but has the disadvantage of ignoring the uncertainty in the phylogenetic tree and evolutionary timescale. RESULTS: We compared the performance of a fully Bayesian and a hybrid method by analysing six whole-genome SNP data sets from a range of bacteria and simulations. The estimates from the two methods were very similar, suggesting that the hybrid method is a valid alternative for very large datasets. However, we also found that congruence between these methods is contingent on the presence of strong temporal structure in the data (i.e. clocklike behaviour), which is typically verified using a date-randomisation test in a Bayesian framework. To reduce the computational burden of this Bayesian test we implemented a date-randomisation test using a rapid maximum likelihood method, which has similar performance to its Bayesian counterpart.
CONCLUSIONS: Hybrid approaches can produce reliable inferences of evolutionary timescales and phylodynamic parameters in a fraction of the time required for fully Bayesian analyses. As such, they are a valuable alternative in outbreak studies involving a large number of isolates. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12862-018-1210-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6006949
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6006949 2018-06-26 Inferring demographic parameters in bacterial genomic data using Bayesian and hybrid phylogenetic methods Duchene, Sebastian Duchene, David A. Geoghegan, Jemma L. Dyson, Zoe A. Hawkey, Jane Holt, Kathryn E. BMC Evol Biol Methodology Article BioMed Central 2018-06-19 /pmc/articles/PMC6006949/ /pubmed/29914372 http://dx.doi.org/10.1186/s12862-018-1210-5 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Inferring demographic parameters in bacterial genomic data using Bayesian and hybrid phylogenetic methods
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6006949/
https://www.ncbi.nlm.nih.gov/pubmed/29914372
http://dx.doi.org/10.1186/s12862-018-1210-5