Impact and mitigation of sampling bias to determine viral spread: Evaluating discrete phylogeography through CTMC modeling and structured coalescent model approximations

Bayesian phylogeographic inference is a powerful tool in molecular epidemiological studies, which enables reconstruction of the origin and subsequent geographic spread of pathogens. Such inference is, however, potentially affected by geographic sampling bias. Here, we investigated the impact of samp...

Bibliographic Details
Main authors: Layan, Maylis; Müller, Nicola F; Dellicour, Simon; De Maio, Nicola; Bourhy, Hervé; Cauchemez, Simon; Baele, Guy
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9969415/
https://www.ncbi.nlm.nih.gov/pubmed/36860641
http://dx.doi.org/10.1093/ve/vead010
author Layan, Maylis
Müller, Nicola F
Dellicour, Simon
De Maio, Nicola
Bourhy, Hervé
Cauchemez, Simon
Baele, Guy
collection PubMed
description Bayesian phylogeographic inference is a powerful tool in molecular epidemiological studies, which enables reconstruction of the origin and subsequent geographic spread of pathogens. Such inference is, however, potentially affected by geographic sampling bias. Here, we investigated the impact of sampling bias on the spatiotemporal reconstruction of viral epidemics using Bayesian discrete phylogeographic models and explored different operational strategies to mitigate this impact. We considered the continuous-time Markov chain (CTMC) model and two structured coalescent approximations (Bayesian structured coalescent approximation [BASTA] and marginal approximation of the structured coalescent [MASCOT]). For each approach, we compared the estimated and simulated spatiotemporal histories in biased and unbiased conditions based on the simulated epidemics of rabies virus (RABV) in dogs in Morocco. While the reconstructed spatiotemporal histories were impacted by sampling bias for the three approaches, BASTA and MASCOT reconstructions were also biased when employing unbiased samples. Increasing the number of analyzed genomes led to more robust estimates at low sampling bias for the CTMC model. Alternative sampling strategies that maximize the spatiotemporal coverage greatly improved the inference at intermediate sampling bias for the CTMC model, and to a lesser extent, for BASTA and MASCOT. In contrast, allowing for time-varying population sizes in MASCOT resulted in robust inference. We further applied these approaches to two empirical datasets: a RABV dataset from the Philippines and a SARS-CoV-2 dataset describing its early spread across the world. In conclusion, sampling biases are ubiquitous in phylogeographic analyses but may be accommodated by increasing the sample size, balancing spatial and temporal composition in the samples, and informing structured coalescent models with reliable case count data.
format Online
Article
Text
id pubmed-9969415
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9969415 2023-02-28 Virus Evol Research Article Oxford University Press 2023-02-06 /pmc/articles/PMC9969415/ /pubmed/36860641 http://dx.doi.org/10.1093/ve/vead010 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Impact and mitigation of sampling bias to determine viral spread: Evaluating discrete phylogeography through CTMC modeling and structured coalescent model approximations
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9969415/
https://www.ncbi.nlm.nih.gov/pubmed/36860641
http://dx.doi.org/10.1093/ve/vead010