Modeling Tick Populations: An Ecological Test Case for Gradient Boosted Trees

General linear models have been the foundational statistical framework used to discover the ecological processes that explain the distribution and abundance of natural populations. Analyses of the rapidly expanding cache of environmental and ecological data, however, require advanced statistical met...


Bibliographic Details
Main Authors: Manley, William; Tran, Tam; Prusinski, Melissa; Brisson, Dustin
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10054924/
https://www.ncbi.nlm.nih.gov/pubmed/36993623
http://dx.doi.org/10.1101/2023.03.13.532443
author Manley, William
Tran, Tam
Prusinski, Melissa
Brisson, Dustin
collection PubMed
description General linear models have been the foundational statistical framework used to discover the ecological processes that explain the distribution and abundance of natural populations. Analyses of the rapidly expanding cache of environmental and ecological data, however, require advanced statistical methods to contend with complexities inherent to extremely large natural data sets. Modern machine learning frameworks such as gradient boosted trees efficiently identify complex ecological relationships in massive data sets, which are expected to result in accurate predictions of the distribution and abundance of organisms in nature. However, rigorous assessments of the theoretical advantages of these methodologies on natural data sets are rare. Here we compare the abilities of gradient boosted and linear models to identify environmental features that explain observed variations in the distribution and abundance of blacklegged tick (Ixodes scapularis) populations in a data set collected across New York State over a ten-year period. The gradient boosted and linear models use similar environmental features to explain tick demography, although the gradient boosted models found non-linear relationships and interactions that are difficult to anticipate and often impractical to identify with a linear modeling framework. Further, the gradient boosted models predicted the distribution and abundance of ticks in years and areas beyond the training data with much greater accuracy than their linear model counterparts. The flexible gradient boosting framework also permitted additional model types that provide practical advantages for tick surveillance and public health. The results highlight the potential of gradient boosted models to discover novel ecological phenomena affecting pathogen demography and as a powerful public health tool to mitigate disease risks.
format Online
Article
Text
id pubmed-10054924
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-10054924 2023-03-30 Modeling Tick Populations: An Ecological Test Case for Gradient Boosted Trees Manley, William Tran, Tam Prusinski, Melissa Brisson, Dustin bioRxiv Article Cold Spring Harbor Laboratory 2023-08-29 /pmc/articles/PMC10054924/ /pubmed/36993623 http://dx.doi.org/10.1101/2023.03.13.532443 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.
title Modeling Tick Populations: An Ecological Test Case for Gradient Boosted Trees
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10054924/
https://www.ncbi.nlm.nih.gov/pubmed/36993623
http://dx.doi.org/10.1101/2023.03.13.532443