
Robust and simplified machine learning identification of pitfall trap‐collected ground beetles at the continental scale

1. Insect populations are changing rapidly, and monitoring these changes is essential for understanding the causes and consequences of such shifts. However, large‐scale insect identification projects are time‐consuming and expensive when done solely by human identifiers. Machine learning offers a po...


Bibliographic Details
Main authors: Blair, Jarrett; Weiser, Michael D.; Kaspari, Michael; Miller, Matthew; Siler, Cameron; Marshall, Katie E.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7713910/
https://www.ncbi.nlm.nih.gov/pubmed/33304524
http://dx.doi.org/10.1002/ece3.6905
author Blair, Jarrett
Weiser, Michael D.
Kaspari, Michael
Miller, Matthew
Siler, Cameron
Marshall, Katie E.
collection PubMed
description 1. Insect populations are changing rapidly, and monitoring these changes is essential for understanding the causes and consequences of such shifts. However, large‐scale insect identification projects are time‐consuming and expensive when done solely by human identifiers. Machine learning offers a possible solution to help collect insect data quickly and efficiently.
2. Here, we outline a methodology for training classification models to identify pitfall trap‐collected insects from image data and then apply the method to identify ground beetles (Carabidae). All beetles were collected by the National Ecological Observatory Network (NEON), a continental‐scale ecological monitoring project with sites across the United States. We describe the procedures for image collection, image data extraction, data preparation, and model training, and compare the performance of five machine learning algorithms and two classification methods (hierarchical vs. single‐level) in identifying ground beetles from the species to the subfamily level. All models were trained using pre‐extracted feature vectors, not raw image data. Our methodology allows data to be extracted from multiple individuals within the same image, thus improving time efficiency; it uses relatively simple models that allow for direct assessment of model performance; and it can be applied to relatively small datasets.
3. The best‐performing algorithm, linear discriminant analysis (LDA), reached an accuracy of 84.6% at the species level when naively identifying species, which increased to >95% when classifications were limited to known local species pools. Model performance was negatively correlated with taxonomic specificity, with the LDA model reaching an accuracy of ~99% at the subfamily level. When classifying carabid species not included in the training dataset at higher taxonomic levels, the models performed significantly better than random classification. At higher taxonomic levels, the hierarchical classification method also outperformed the single‐level classification method.
4. The general methodology outlined here serves as a proof‐of‐concept for classifying pitfall trap‐collected organisms using machine learning algorithms, and the image data extraction methodology may also be useful outside machine learning applications. We propose that integrating machine learning into large‐scale identification pipelines will increase efficiency and lead to a greater flow of insect macroecological data, with the potential to be expanded to non‐insect taxa.
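The abstract reports that limiting classifications to known local species pools raised species-level accuracy. This record does not say how the authors implemented that step, so the following is only a minimal sketch under stated assumptions: a classifier (for example an LDA model) has already produced a posterior probability per species, and the restriction is applied by discarding species absent from the local pool and renormalizing. The function name `classify_within_pool` and the species labels are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple


def classify_within_pool(
    posteriors: Dict[str, float], local_pool: Iterable[str]
) -> Optional[Tuple[str, float]]:
    """Return the most probable species among those in the local pool.

    Assumption (not from the source): `posteriors` maps each candidate
    species to the probability assigned by an already-trained classifier.
    """
    pool = set(local_pool)
    # Discard every candidate that is not known to occur at this site.
    allowed = {sp: p for sp, p in posteriors.items() if sp in pool}
    total = sum(allowed.values())
    if total <= 0.0:
        # No candidate from the pool has any probability mass.
        return None
    best = max(allowed, key=allowed.get)
    # Renormalize so the returned probability is conditional on the pool.
    return best, allowed[best] / total


if __name__ == "__main__":
    # Hypothetical posteriors for one specimen.
    probs = {"species_a": 0.5, "species_b": 0.3, "species_c": 0.2}
    print(classify_within_pool(probs, ["species_b", "species_c"]))
```

In this illustration the unrestricted choice would be `species_a`, but because it is not in the hypothetical local pool, the restricted choice becomes `species_b`.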
format Online
Article
Text
id pubmed-7713910
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-7713910 2020-12-09 Robust and simplified machine learning identification of pitfall trap‐collected ground beetles at the continental scale Blair, Jarrett Weiser, Michael D. Kaspari, Michael Miller, Matthew Siler, Cameron Marshall, Katie E. Ecol Evol Original Research John Wiley and Sons Inc. 2020-11-11 /pmc/articles/PMC7713910/ /pubmed/33304524 http://dx.doi.org/10.1002/ece3.6905 Text en © 2020 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd. This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Robust and simplified machine learning identification of pitfall trap‐collected ground beetles at the continental scale
topic Original Research