
Using Convolutional Neural Networks to Efficiently Extract Immense Phenological Data From Community Science Images

Community science image libraries offer a massive, but largely untapped, source of observational data for phenological research. The iNaturalist platform offers a particularly rich archive, containing more than 49 million verifiable, georeferenced, open access images, encompassing seven continents a...


Bibliographic Details
Main Authors: Reeb, Rachel A., Aziz, Naeem, Lapp, Samuel M., Kitzes, Justin, Heberling, J. Mason, Kuebbing, Sara E.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Plant Science
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8801702/
https://www.ncbi.nlm.nih.gov/pubmed/35111176
http://dx.doi.org/10.3389/fpls.2021.787407
_version_ 1784642520645369856
author Reeb, Rachel A.
Aziz, Naeem
Lapp, Samuel M.
Kitzes, Justin
Heberling, J. Mason
Kuebbing, Sara E.
author_facet Reeb, Rachel A.
Aziz, Naeem
Lapp, Samuel M.
Kitzes, Justin
Heberling, J. Mason
Kuebbing, Sara E.
author_sort Reeb, Rachel A.
collection PubMed
description Community science image libraries offer a massive, but largely untapped, source of observational data for phenological research. The iNaturalist platform offers a particularly rich archive, containing more than 49 million verifiable, georeferenced, open access images, encompassing seven continents and over 278,000 species. A critical limitation preventing scientists from taking full advantage of this rich data source is labor. Each image must be manually inspected and categorized by phenophase, which is both time-intensive and costly. Consequently, researchers may only be able to use a subset of the total number of images available in the database. While iNaturalist has the potential to yield enough data for high-resolution and spatially extensive studies, it requires more efficient tools for phenological data extraction. A promising solution is automation of the image annotation process using deep learning. Recent innovations in deep learning have made these open-source tools accessible to a general research audience. However, it is unknown whether deep learning tools can accurately and efficiently annotate phenophases in community science images. Here, we train a convolutional neural network (CNN) to annotate images of Alliaria petiolata into distinct phenophases from iNaturalist and compare the performance of the model with non-expert human annotators. We demonstrate that researchers can successfully employ deep learning techniques to extract phenological information from community science images. A CNN classified two-stage phenology (flowering and non-flowering) with 95.9% accuracy and classified four-stage phenology (vegetative, budding, flowering, and fruiting) with 86.4% accuracy. The overall accuracy of the CNN did not differ from humans (p = 0.383), although performance varied across phenophases. 
We found that a primary challenge of using deep learning for image annotation was not related to the model itself, but instead in the quality of the community science images. Up to 4% of A. petiolata images in iNaturalist were taken from an improper distance, were physically manipulated, or were digitally altered, which limited both human and machine annotators in accurately classifying phenology. Thus, we provide a list of photography guidelines that could be included in community science platforms to inform community scientists in the best practices for creating images that facilitate phenological analysis.
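The abstract reports an overall accuracy and notes that performance varied across phenophases. As a minimal sketch of how those two kinds of figures can be computed from annotated images, the Python below compares reference labels with predicted labels. The function names and the toy labels are hypothetical illustrations, not the authors' code or data.

```python
from collections import Counter

# The four phenophases named in the abstract.
PHENOPHASES = ("vegetative", "budding", "flowering", "fruiting")


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of images whose predicted phenophase matches the reference label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_phase_recall(true_labels, predicted_labels):
    """For each phenophase, the fraction of its reference images classified correctly."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {
        phase: hits[phase] / totals[phase]
        for phase in PHENOPHASES
        if totals[phase] > 0
    }


# Hypothetical toy labels for six images (not data from the study).
true_labels = ["vegetative", "budding", "flowering", "flowering", "fruiting", "budding"]
predicted_labels = ["vegetative", "flowering", "flowering", "flowering", "fruiting", "budding"]

print(overall_accuracy(true_labels, predicted_labels))
print(per_phase_recall(true_labels, predicted_labels))
```

Reporting the per-phase values alongside the overall figure shows which stages a classifier, or a human annotator, tends to confuse, which a single accuracy number hides.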
format Online
Article
Text
id pubmed-8801702
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8801702 2022-02-01 Using Convolutional Neural Networks to Efficiently Extract Immense Phenological Data From Community Science Images Reeb, Rachel A. Aziz, Naeem Lapp, Samuel M. Kitzes, Justin Heberling, J. Mason Kuebbing, Sara E. Front Plant Sci Plant Science Community science image libraries offer a massive, but largely untapped, source of observational data for phenological research. The iNaturalist platform offers a particularly rich archive, containing more than 49 million verifiable, georeferenced, open access images, encompassing seven continents and over 278,000 species. A critical limitation preventing scientists from taking full advantage of this rich data source is labor. Each image must be manually inspected and categorized by phenophase, which is both time-intensive and costly. Consequently, researchers may only be able to use a subset of the total number of images available in the database. While iNaturalist has the potential to yield enough data for high-resolution and spatially extensive studies, it requires more efficient tools for phenological data extraction. A promising solution is automation of the image annotation process using deep learning. Recent innovations in deep learning have made these open-source tools accessible to a general research audience. However, it is unknown whether deep learning tools can accurately and efficiently annotate phenophases in community science images. Here, we train a convolutional neural network (CNN) to annotate images of Alliaria petiolata into distinct phenophases from iNaturalist and compare the performance of the model with non-expert human annotators. We demonstrate that researchers can successfully employ deep learning techniques to extract phenological information from community science images. A CNN classified two-stage phenology (flowering and non-flowering) with 95.9% accuracy and classified four-stage phenology (vegetative, budding, flowering, and fruiting) with 86.4% accuracy.
The overall accuracy of the CNN did not differ from humans (p = 0.383), although performance varied across phenophases. We found that a primary challenge of using deep learning for image annotation was not related to the model itself, but instead in the quality of the community science images. Up to 4% of A. petiolata images in iNaturalist were taken from an improper distance, were physically manipulated, or were digitally altered, which limited both human and machine annotators in accurately classifying phenology. Thus, we provide a list of photography guidelines that could be included in community science platforms to inform community scientists in the best practices for creating images that facilitate phenological analysis. Frontiers Media S.A. 2022-01-17 /pmc/articles/PMC8801702/ /pubmed/35111176 http://dx.doi.org/10.3389/fpls.2021.787407 Text en Copyright © 2022 Reeb, Aziz, Lapp, Kitzes, Heberling and Kuebbing. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Plant Science
Reeb, Rachel A.
Aziz, Naeem
Lapp, Samuel M.
Kitzes, Justin
Heberling, J. Mason
Kuebbing, Sara E.
Using Convolutional Neural Networks to Efficiently Extract Immense Phenological Data From Community Science Images
title Using Convolutional Neural Networks to Efficiently Extract Immense Phenological Data From Community Science Images
title_full Using Convolutional Neural Networks to Efficiently Extract Immense Phenological Data From Community Science Images
title_fullStr Using Convolutional Neural Networks to Efficiently Extract Immense Phenological Data From Community Science Images
title_full_unstemmed Using Convolutional Neural Networks to Efficiently Extract Immense Phenological Data From Community Science Images
title_short Using Convolutional Neural Networks to Efficiently Extract Immense Phenological Data From Community Science Images
title_sort using convolutional neural networks to efficiently extract immense phenological data from community science images
topic Plant Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8801702/
https://www.ncbi.nlm.nih.gov/pubmed/35111176
http://dx.doi.org/10.3389/fpls.2021.787407
work_keys_str_mv AT reebrachela usingconvolutionalneuralnetworkstoefficientlyextractimmensephenologicaldatafromcommunityscienceimages
AT aziznaeem usingconvolutionalneuralnetworkstoefficientlyextractimmensephenologicaldatafromcommunityscienceimages
AT lappsamuelm usingconvolutionalneuralnetworkstoefficientlyextractimmensephenologicaldatafromcommunityscienceimages
AT kitzesjustin usingconvolutionalneuralnetworkstoefficientlyextractimmensephenologicaldatafromcommunityscienceimages
AT heberlingjmason usingconvolutionalneuralnetworkstoefficientlyextractimmensephenologicaldatafromcommunityscienceimages
AT kuebbingsarae usingconvolutionalneuralnetworkstoefficientlyextractimmensephenologicaldatafromcommunityscienceimages