Maximizing human effort for analyzing scientific images: A case study using digitized herbarium sheets
PREMISE: Digitization and imaging of herbarium specimens provides essential historical phenotypic and phenological information about plants. However, the full use of these resources requires high‐quality human annotations for downstream use. Here we provide guidance on the design and implementation...
Main authors: | Brenskelle, Laura, Guralnick, Rob P., Denslow, Michael, Stucky, Brian J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7328657/ https://www.ncbi.nlm.nih.gov/pubmed/32626612 http://dx.doi.org/10.1002/aps3.11370 |
_version_ | 1783552770644639744 |
---|---|
author | Brenskelle, Laura Guralnick, Rob P. Denslow, Michael Stucky, Brian J. |
author_facet | Brenskelle, Laura Guralnick, Rob P. Denslow, Michael Stucky, Brian J. |
author_sort | Brenskelle, Laura |
collection | PubMed |
description | PREMISE: Digitization and imaging of herbarium specimens provides essential historical phenotypic and phenological information about plants. However, the full use of these resources requires high‐quality human annotations for downstream use. Here we provide guidance on the design and implementation of image annotation projects for botanical research. METHODS AND RESULTS: We used a novel gold‐standard data set to test the accuracy of human phenological annotations of herbarium specimen images in two settings: structured, in‐person sessions and an online, community‐science platform. We examined how different factors influenced annotation accuracy and found that botanical expertise, academic career level, and time spent on annotations had little effect on accuracy. Rather, key factors included traits and taxa being scored, the annotation setting, and the individual scorer. In‐person annotations were significantly more accurate than online annotations, but both generated relatively high‐quality outputs. Gathering multiple, independent annotations for each image improved overall accuracy. CONCLUSIONS: Our results provide a best‐practices basis for using human effort to annotate images of plants. We show that scalable community science mechanisms can produce high‐quality data, but care must be taken to choose tractable taxa and phenophases and to provide informative training material. |
format | Online Article Text |
id | pubmed-7328657 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-73286572020-07-02 Maximizing human effort for analyzing scientific images: A case study using digitized herbarium sheets Brenskelle, Laura Guralnick, Rob P. Denslow, Michael Stucky, Brian J. Appl Plant Sci Protocol Notes PREMISE: Digitization and imaging of herbarium specimens provides essential historical phenotypic and phenological information about plants. However, the full use of these resources requires high‐quality human annotations for downstream use. Here we provide guidance on the design and implementation of image annotation projects for botanical research. METHODS AND RESULTS: We used a novel gold‐standard data set to test the accuracy of human phenological annotations of herbarium specimen images in two settings: structured, in‐person sessions and an online, community‐science platform. We examined how different factors influenced annotation accuracy and found that botanical expertise, academic career level, and time spent on annotations had little effect on accuracy. Rather, key factors included traits and taxa being scored, the annotation setting, and the individual scorer. In‐person annotations were significantly more accurate than online annotations, but both generated relatively high‐quality outputs. Gathering multiple, independent annotations for each image improved overall accuracy. CONCLUSIONS: Our results provide a best‐practices basis for using human effort to annotate images of plants. We show that scalable community science mechanisms can produce high‐quality data, but care must be taken to choose tractable taxa and phenophases and to provide informative training material. John Wiley and Sons Inc. 2020-07-01 /pmc/articles/PMC7328657/ /pubmed/32626612 http://dx.doi.org/10.1002/aps3.11370 Text en © 2020 Brenskelle et al. 
Applications in Plant Sciences is published by Wiley Periodicals, LLC on behalf of the Botanical Society of America This is an open access article under the terms of the http://creativecommons.org/licenses/by/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Protocol Notes Brenskelle, Laura Guralnick, Rob P. Denslow, Michael Stucky, Brian J. Maximizing human effort for analyzing scientific images: A case study using digitized herbarium sheets |
title | Maximizing human effort for analyzing scientific images: A case study using digitized herbarium sheets |
title_full | Maximizing human effort for analyzing scientific images: A case study using digitized herbarium sheets |
title_fullStr | Maximizing human effort for analyzing scientific images: A case study using digitized herbarium sheets |
title_full_unstemmed | Maximizing human effort for analyzing scientific images: A case study using digitized herbarium sheets |
title_short | Maximizing human effort for analyzing scientific images: A case study using digitized herbarium sheets |
title_sort | maximizing human effort for analyzing scientific images: a case study using digitized herbarium sheets |
topic | Protocol Notes |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7328657/ https://www.ncbi.nlm.nih.gov/pubmed/32626612 http://dx.doi.org/10.1002/aps3.11370 |
work_keys_str_mv | AT brenskellelaura maximizinghumaneffortforanalyzingscientificimagesacasestudyusingdigitizedherbariumsheets AT guralnickrobp maximizinghumaneffortforanalyzingscientificimagesacasestudyusingdigitizedherbariumsheets AT denslowmichael maximizinghumaneffortforanalyzingscientificimagesacasestudyusingdigitizedherbariumsheets AT stuckybrianj maximizinghumaneffortforanalyzingscientificimagesacasestudyusingdigitizedherbariumsheets |