Predicting insect outbreaks using machine learning: A mountain pine beetle case study

Planning forest management relies on predicting insect outbreaks such as those of mountain pine beetle, particularly in the intermediate‐term future (e.g., 5 years ahead). Machine‐learning algorithms are potential solutions to this challenging problem, given their many successes across a variety of prediction tasks....

Bibliographic Details
Main authors: Ramazi, Pouria, Kunegel‐Lion, Mélodie, Greiner, Russell, Lewis, Mark A.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8495826/
https://www.ncbi.nlm.nih.gov/pubmed/34646449
http://dx.doi.org/10.1002/ece3.7921
author Ramazi, Pouria
Kunegel‐Lion, Mélodie
Greiner, Russell
Lewis, Mark A.
collection PubMed
description Planning forest management relies on predicting insect outbreaks such as those of mountain pine beetle, particularly in the intermediate‐term future (e.g., 5 years ahead). Machine‐learning algorithms are potential solutions to this challenging problem, given their many successes across a variety of prediction tasks. However, applying them raises several subtle challenges: identifying the best learning models, selecting the best subset of available covariates (including time lags), and properly evaluating the models to avoid misleading performance measures. We systematically address these issues in predicting the chance of a mountain pine beetle outbreak in the Cypress Hills area and seek the models that perform best at predicting infestations 1, 3, 5, and 7 years into the future. We train nine machine‐learning models, including two generalized boosted regression trees (GBM) that predict future 1‐ and 3‐year infestations with 92% and 88% AUC, and two novel mixed models that predict future 5‐ and 7‐year infestations with 86% and 84% AUC, respectively. We also consider forming the train and test datasets by splitting the original dataset randomly rather than using the appropriate year‐based approach, and show that this can yield models that score high on the test dataset but low in practice, resulting in inaccurate performance evaluations. For example, a k‐nearest neighbor model with an actual performance of 68% AUC scores a misleadingly high 78% on a test dataset obtained from a random split, but a more accurate 66% on a year‐based split. We then investigate how prediction accuracy varies with the history length of the provided covariates and find that neural network and naive Bayes models predict more accurately as history length increases, particularly for future 1‐ and 3‐year predictions, and that roughly the same holds for GBM. Our approach is applicable to other invasive species. The resulting predictors can be used in planning forest and pest management and in planning sampling locations in field studies.
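The contrast the abstract draws between a random split and a year‐based split can be illustrated with a small Python sketch. This is not the authors' code: the record fields (`year`, `cell`, `infested`), the toy data, the cutoff year, and the function names are all assumptions made for illustration, and the AUC is computed with a simple pairwise definition rather than any particular library.

```python
import random


def year_based_split(rows, cutoff_year):
    """Train on years up to and including cutoff_year; test on later years only."""
    train = [r for r in rows if r["year"] <= cutoff_year]
    test = [r for r in rows if r["year"] > cutoff_year]
    return train, test


def random_split(rows, test_fraction, seed=0):
    """Shuffle the rows and hold out a fraction, ignoring the year column."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy records: 10 years x 4 grid cells with a made-up label.
rows = [
    {"year": y, "cell": c, "infested": (y + c) % 2}
    for y in range(2006, 2016)
    for c in range(4)
]

# Year-based split: every test year lies strictly after every training year.
train, test = year_based_split(rows, cutoff_year=2012)

# Random split: the same year can appear on both sides of the split,
# so the test set may share information with the training set.
rand_train, rand_test = random_split(rows, test_fraction=0.3)
shared_years = {r["year"] for r in rand_train} & {r["year"] for r in rand_test}
```

With a year‐based split the test set only contains years the model has never seen, which mimics forecasting. With a random split, `shared_years` is typically non‐empty, which is one way a test score can end up higher than performance in practice.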
format Online
Article
Text
id pubmed-8495826
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-8495826 2021-10-12 Predicting insect outbreaks using machine learning: A mountain pine beetle case study. Ramazi, Pouria; Kunegel‐Lion, Mélodie; Greiner, Russell; Lewis, Mark A. Ecol Evol. Original Research. John Wiley and Sons Inc. 2021-09-12 /pmc/articles/PMC8495826/ /pubmed/34646449 http://dx.doi.org/10.1002/ece3.7921 Text en © 2021 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Predicting insect outbreaks using machine learning: A mountain pine beetle case study
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8495826/
https://www.ncbi.nlm.nih.gov/pubmed/34646449
http://dx.doi.org/10.1002/ece3.7921