A New Method for Counting Reproductive Structures in Digitized Herbarium Specimens Using Mask R-CNN

Phenology—the timing of life-history events—is a key trait for understanding responses of organisms to climate. The digitization and online mobilization of herbarium specimens is rapidly advancing our understanding of plant phenological response to climate and climatic change. The current practice o...

Bibliographic Details
Main authors: Davis, Charles C., Champ, Julien, Park, Daniel S., Breckheimer, Ian, Lyra, Goia M., Xie, Junxi, Joly, Alexis, Tarapore, Dharmesh, Ellison, Aaron M., Bonnet, Pierre
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7411132/
https://www.ncbi.nlm.nih.gov/pubmed/32849691
http://dx.doi.org/10.3389/fpls.2020.01129
_version_ 1783568311070490624
author Davis, Charles C.
Champ, Julien
Park, Daniel S.
Breckheimer, Ian
Lyra, Goia M.
Xie, Junxi
Joly, Alexis
Tarapore, Dharmesh
Ellison, Aaron M.
Bonnet, Pierre
author_sort Davis, Charles C.
collection PubMed
description Phenology, the timing of life-history events, is a key trait for understanding responses of organisms to climate. The digitization and online mobilization of herbarium specimens is rapidly advancing our understanding of plant phenological response to climate and climatic change. The current practice of manually harvesting data from individual specimens, however, greatly restricts our ability to scale up data collection. Recent investigations have demonstrated that machine-learning approaches can facilitate this effort. However, present attempts have focused largely on simplistic binary coding of reproductive phenology (e.g., presence/absence of flowers). Here, we use crowd-sourced phenological data of buds, flowers, and fruits from >3,000 specimens of six common wildflower species of the eastern United States (Anemone canadensis L., A. hepatica L., A. quinquefolia L., Trillium erectum L., T. grandiflorum (Michx.) Salisb., and T. undulatum Willd.) to train models using Mask R-CNN to segment and count phenological features. A single global model was able to automate the binary coding of each of the three reproductive stages with >87% accuracy. We also successfully estimated the relative abundance of each reproductive structure on a specimen with ≥90% accuracy. Precise counting of features was also successful, but accuracy varied with phenological stage and taxon. Specifically, counting flowers was significantly less accurate than counting buds or fruits, likely due to their morphological variability on pressed specimens. Moreover, our Mask R-CNN model provided more reliable data than non-expert crowd-sourcers but not botanical experts, highlighting the importance of high-quality human training data. Finally, we also demonstrated the transferability of our model to automated phenophase detection and counting of the three Trillium species, which have large and conspicuously shaped reproductive organs. These results highlight the promise of our two-phase crowd-sourcing and machine-learning pipeline to segment and count reproductive features of herbarium specimens, thus providing high-quality data with which to investigate plant responses to ongoing climatic change.
format Online
Article
Text
id pubmed-7411132
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7411132 2020-08-25 A New Method for Counting Reproductive Structures in Digitized Herbarium Specimens Using Mask R-CNN Davis, Charles C. Champ, Julien Park, Daniel S. Breckheimer, Ian Lyra, Goia M. Xie, Junxi Joly, Alexis Tarapore, Dharmesh Ellison, Aaron M. Bonnet, Pierre Front Plant Sci Plant Science Phenology, the timing of life-history events, is a key trait for understanding responses of organisms to climate. The digitization and online mobilization of herbarium specimens is rapidly advancing our understanding of plant phenological response to climate and climatic change. The current practice of manually harvesting data from individual specimens, however, greatly restricts our ability to scale up data collection. Recent investigations have demonstrated that machine-learning approaches can facilitate this effort. However, present attempts have focused largely on simplistic binary coding of reproductive phenology (e.g., presence/absence of flowers). Here, we use crowd-sourced phenological data of buds, flowers, and fruits from >3,000 specimens of six common wildflower species of the eastern United States (Anemone canadensis L., A. hepatica L., A. quinquefolia L., Trillium erectum L., T. grandiflorum (Michx.) Salisb., and T. undulatum Willd.) to train models using Mask R-CNN to segment and count phenological features. A single global model was able to automate the binary coding of each of the three reproductive stages with >87% accuracy. We also successfully estimated the relative abundance of each reproductive structure on a specimen with ≥90% accuracy. Precise counting of features was also successful, but accuracy varied with phenological stage and taxon. Specifically, counting flowers was significantly less accurate than counting buds or fruits, likely due to their morphological variability on pressed specimens. Moreover, our Mask R-CNN model provided more reliable data than non-expert crowd-sourcers but not botanical experts, highlighting the importance of high-quality human training data. Finally, we also demonstrated the transferability of our model to automated phenophase detection and counting of the three Trillium species, which have large and conspicuously shaped reproductive organs. These results highlight the promise of our two-phase crowd-sourcing and machine-learning pipeline to segment and count reproductive features of herbarium specimens, thus providing high-quality data with which to investigate plant responses to ongoing climatic change. Frontiers Media S.A. 2020-07-31 /pmc/articles/PMC7411132/ /pubmed/32849691 http://dx.doi.org/10.3389/fpls.2020.01129 Text en Copyright © 2020 Davis, Champ, Park, Breckheimer, Lyra, Xie, Joly, Tarapore, Ellison and Bonnet http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Plant Science
Davis, Charles C.
Champ, Julien
Park, Daniel S.
Breckheimer, Ian
Lyra, Goia M.
Xie, Junxi
Joly, Alexis
Tarapore, Dharmesh
Ellison, Aaron M.
Bonnet, Pierre
A New Method for Counting Reproductive Structures in Digitized Herbarium Specimens Using Mask R-CNN
title A New Method for Counting Reproductive Structures in Digitized Herbarium Specimens Using Mask R-CNN
topic Plant Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7411132/
https://www.ncbi.nlm.nih.gov/pubmed/32849691
http://dx.doi.org/10.3389/fpls.2020.01129