
Jointly Inferring the Dynamics of Population Size and Sampling Intensity from Molecular Sequences

Estimating past population dynamics from molecular sequences that have been sampled longitudinally through time is an important problem in infectious disease epidemiology, molecular ecology, and macroevolution. Popular solutions, such as the skyline and skygrid methods, infer past effective populati...


Bibliographic Details
Main Authors: Parag, Kris V, du Plessis, Louis, Pybus, Oliver G
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7403618/
https://www.ncbi.nlm.nih.gov/pubmed/32003829
http://dx.doi.org/10.1093/molbev/msaa016
_version_ 1783566977543962624
author Parag, Kris V
du Plessis, Louis
Pybus, Oliver G
author_facet Parag, Kris V
du Plessis, Louis
Pybus, Oliver G
author_sort Parag, Kris V
collection PubMed
description Estimating past population dynamics from molecular sequences that have been sampled longitudinally through time is an important problem in infectious disease epidemiology, molecular ecology, and macroevolution. Popular solutions, such as the skyline and skygrid methods, infer past effective population sizes from the coalescent event times of phylogenies reconstructed from sampled sequences but assume that sequence sampling times are uninformative about population size changes. Recent work has started to question this assumption by exploring how sampling time information can aid coalescent inference. Here, we develop, investigate, and implement a new skyline method, termed the epoch sampling skyline plot (ESP), to jointly estimate the dynamics of population size and sampling rate through time. The ESP is inspired by real-world data collection practices and comprises a flexible model in which the sequence sampling rate is proportional to the population size within an epoch but can change discontinuously between epochs. We show that the ESP is accurate under several realistic sampling protocols and we prove analytically that it can at least double the best precision achievable by standard approaches. We generalize the ESP to incorporate phylogenetic uncertainty in a new Bayesian package (BESP) in BEAST2. We re-examine two well-studied empirical data sets from virus epidemiology and molecular evolution and find that the BESP improves upon previous coalescent estimators and generates new, biologically useful insights into the sampling protocols underpinning these data sets. Sequence sampling times provide a rich source of information for coalescent inference that will become increasingly important as sequence collection intensifies and becomes more formalized.
format Online
Article
Text
id pubmed-7403618
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7403618 2020-08-07 Jointly Inferring the Dynamics of Population Size and Sampling Intensity from Molecular Sequences Parag, Kris V du Plessis, Louis Pybus, Oliver G Mol Biol Evol Methods
Oxford University Press 2020-08 2020-01-31 /pmc/articles/PMC7403618/ /pubmed/32003829 http://dx.doi.org/10.1093/molbev/msaa016 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methods
Parag, Kris V
du Plessis, Louis
Pybus, Oliver G
Jointly Inferring the Dynamics of Population Size and Sampling Intensity from Molecular Sequences
title Jointly Inferring the Dynamics of Population Size and Sampling Intensity from Molecular Sequences
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7403618/
https://www.ncbi.nlm.nih.gov/pubmed/32003829
http://dx.doi.org/10.1093/molbev/msaa016