Optimising predictive models to prioritise viral discovery in zoonotic reservoirs

Despite the global investment in One Health disease surveillance, it remains difficult and costly to identify and monitor the wildlife reservoirs of novel zoonotic viruses. Statistical models can guide sampling target prioritisation, but the predictions from any given model might be highly uncertain...

Bibliographic Details
Main authors: Becker, Daniel J, Albery, Gregory F, Sjodin, Anna R, Poisot, Timothée, Bergner, Laura M, Chen, Binqi, Cohen, Lily E, Dallas, Tad A, Eskew, Evan A, Fagre, Anna C, Farrell, Maxwell J, Guth, Sarah, Han, Barbara A, Simmons, Nancy B, Stock, Michiel, Teeling, Emma C, Carlson, Colin J
Format: Online Article Text
Language: English
Published: The Authors. Published by Elsevier Ltd. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8747432/
https://www.ncbi.nlm.nih.gov/pubmed/35036970
http://dx.doi.org/10.1016/S2666-5247(21)00245-7
author Becker, Daniel J
Albery, Gregory F
Sjodin, Anna R
Poisot, Timothée
Bergner, Laura M
Chen, Binqi
Cohen, Lily E
Dallas, Tad A
Eskew, Evan A
Fagre, Anna C
Farrell, Maxwell J
Guth, Sarah
Han, Barbara A
Simmons, Nancy B
Stock, Michiel
Teeling, Emma C
Carlson, Colin J
author_sort Becker, Daniel J
collection PubMed
description Despite the global investment in One Health disease surveillance, it remains difficult and costly to identify and monitor the wildlife reservoirs of novel zoonotic viruses. Statistical models can guide sampling target prioritisation, but the predictions from any given model might be highly uncertain; moreover, systematic model validation is rare, and the drivers of model performance are consequently under-documented. Here, we use the bat hosts of betacoronaviruses as a case study for the data-driven process of comparing and validating predictive models of probable reservoir hosts. In early 2020, we generated an ensemble of eight statistical models that predicted host–virus associations and developed priority sampling recommendations for potential bat reservoirs of betacoronaviruses and bridge hosts for SARS-CoV-2. During a time frame of more than a year, we tracked the discovery of 47 new bat hosts of betacoronaviruses, validated the initial predictions, and dynamically updated our analytical pipeline. We found that ecological trait-based models performed well at predicting these novel hosts, whereas network methods consistently performed approximately as well as, or worse than, expected at random. These findings illustrate the importance of ensemble modelling as a buffer against mixed-model quality and highlight the value of including host ecology in predictive models. Our revised models showed an improved performance compared with the initial ensemble, and predicted more than 400 bat species globally that could be undetected betacoronavirus hosts. We show, through systematic validation, that machine learning models can help to optimise wildlife sampling for undiscovered viruses and illustrate how such approaches are best implemented through a dynamic process of prediction, data collection, validation, and updating.
format Online
Article
Text
id pubmed-8747432
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher The Authors. Published by Elsevier Ltd.
record_format MEDLINE/PubMed
spelling pubmed-8747432 2022-01-11 Optimising predictive models to prioritise viral discovery in zoonotic reservoirs Becker, Daniel J Albery, Gregory F Sjodin, Anna R Poisot, Timothée Bergner, Laura M Chen, Binqi Cohen, Lily E Dallas, Tad A Eskew, Evan A Fagre, Anna C Farrell, Maxwell J Guth, Sarah Han, Barbara A Simmons, Nancy B Stock, Michiel Teeling, Emma C Carlson, Colin J Lancet Microbe Review The Authors. Published by Elsevier Ltd. 2022-08 2022-01-10 /pmc/articles/PMC8747432/ /pubmed/35036970 http://dx.doi.org/10.1016/S2666-5247(21)00245-7 Text en © 2022 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license
title Optimising predictive models to prioritise viral discovery in zoonotic reservoirs
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8747432/
https://www.ncbi.nlm.nih.gov/pubmed/35036970
http://dx.doi.org/10.1016/S2666-5247(21)00245-7