From leaves to labels: Building modular machine learning networks for rapid herbarium specimen analysis with LeafMachine2

Bibliographic Details
Main authors: Weaver, William N., Smith, Stephen A.
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2023
Subjects: Application Articles
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10617304/
https://www.ncbi.nlm.nih.gov/pubmed/37915430
http://dx.doi.org/10.1002/aps3.11548
author Weaver, William N.
Smith, Stephen A.
collection PubMed
description PREMISE: Quantitative plant traits play a crucial role in biological research. However, traditional methods for measuring plant morphology are time consuming and have limited scalability. We present LeafMachine2, a suite of modular machine learning and computer vision tools that can automatically extract a base set of leaf traits from digital plant data sets. METHODS: LeafMachine2 was trained on 494,766 manually prepared annotations from 5648 herbarium images obtained from 288 institutions and representing 2663 species; it employs a set of plant component detection and segmentation algorithms to isolate individual leaves, petioles, fruits, flowers, wood samples, buds, and roots. Our landmarking network automatically identifies and measures nine pseudo‐landmarks that occur on most broadleaf taxa. Text labels and barcodes are automatically identified by an archival component detector and are prepared for optical character recognition methods or natural language processing algorithms. RESULTS: LeafMachine2 can extract trait data from at least 245 angiosperm families and calculate pixel‐to‐metric conversion factors for 26 commonly used ruler types. DISCUSSION: LeafMachine2 is a highly efficient tool for generating large quantities of plant trait data, even from occluded or overlapping leaves, field images, and non‐archival data sets. Our project, along with similar initiatives, has made significant progress in removing the bottleneck in plant trait data acquisition from herbarium specimens and shifted the focus toward the crucial task of data revision and quality control.
format Online
Article
Text
id pubmed-10617304
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-10617304 2023-11-01 From leaves to labels: Building modular machine learning networks for rapid herbarium specimen analysis with LeafMachine2 Weaver, William N. Smith, Stephen A. Appl Plant Sci Application Articles John Wiley and Sons Inc. 2023-10-16 /pmc/articles/PMC10617304/ /pubmed/37915430 http://dx.doi.org/10.1002/aps3.11548 Text en © 2023 The Authors. Applications in Plant Sciences published by Wiley Periodicals LLC on behalf of Botanical Society of America. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title From leaves to labels: Building modular machine learning networks for rapid herbarium specimen analysis with LeafMachine2
topic Application Articles