Computer vision applied to herbarium specimens of German trees: testing the future utility of the millions of herbarium specimen images for automated identification

BACKGROUND: Global Plants, a collaborative between JSTOR and some 300 herbaria, now contains about 2.48 million high-resolution images of plant specimens, a number that continues to grow, and collections that are digitizing their specimens at high resolution are allocating considerable resources to...

Bibliographic Details
Main Authors:	Unger, Jakob, Merhof, Dorit, Renner, Susanne
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2016
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5112707/ https://www.ncbi.nlm.nih.gov/pubmed/27852219 http://dx.doi.org/10.1186/s12862-016-0827-5

author	Unger, Jakob Merhof, Dorit Renner, Susanne
author_facet	Unger, Jakob Merhof, Dorit Renner, Susanne
author_sort	Unger, Jakob
collection	PubMed
description	BACKGROUND: Global Plants, a collaborative between JSTOR and some 300 herbaria, now contains about 2.48 million high-resolution images of plant specimens, a number that continues to grow, and collections that are digitizing their specimens at high resolution are allocating considerable resources to the maintenance of computer hardware (e.g., servers) and to acquiring digital storage space. We here apply machine learning, specifically the training of a Support-Vector-Machine, to classify specimen images into categories, ideally at the species level, using the 26 most common tree species in Germany as a test case. RESULTS: We designed an analysis pipeline and classification system consisting of segmentation, normalization, feature extraction, and classification steps and evaluated the system in two test sets, one with 26 species, the other with 17, in each case using 10 images per species of plants collected between 1820 and 1995, which simulates the empirical situation that most named species are represented in herbaria and databases, such as JSTOR, by few specimens. We achieved 73.21% accuracy of species assignments in the larger test set, and 84.88% in the smaller test set. CONCLUSIONS: The results of this first application of a computer vision algorithm trained on images of herbarium specimens show that despite the problem of overlapping leaves, leaf-architectural features can be used to categorize specimens to species with good accuracy. Computer vision is poised to play a significant role in future rapid identification at least for frequently collected genera or species in the European flora. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12862-016-0827-5) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-5112707
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-5112707 2016-11-25 Computer vision applied to herbarium specimens of German trees: testing the future utility of the millions of herbarium specimen images for automated identification Unger, Jakob Merhof, Dorit Renner, Susanne BMC Evol Biol Methodology Article BACKGROUND: Global Plants, a collaborative between JSTOR and some 300 herbaria, now contains about 2.48 million high-resolution images of plant specimens, a number that continues to grow, and collections that are digitizing their specimens at high resolution are allocating considerable resources to the maintenance of computer hardware (e.g., servers) and to acquiring digital storage space. We here apply machine learning, specifically the training of a Support-Vector-Machine, to classify specimen images into categories, ideally at the species level, using the 26 most common tree species in Germany as a test case. RESULTS: We designed an analysis pipeline and classification system consisting of segmentation, normalization, feature extraction, and classification steps and evaluated the system in two test sets, one with 26 species, the other with 17, in each case using 10 images per species of plants collected between 1820 and 1995, which simulates the empirical situation that most named species are represented in herbaria and databases, such as JSTOR, by few specimens. We achieved 73.21% accuracy of species assignments in the larger test set, and 84.88% in the smaller test set. CONCLUSIONS: The results of this first application of a computer vision algorithm trained on images of herbarium specimens show that despite the problem of overlapping leaves, leaf-architectural features can be used to categorize specimens to species with good accuracy. Computer vision is poised to play a significant role in future rapid identification at least for frequently collected genera or species in the European flora. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12862-016-0827-5) contains supplementary material, which is available to authorized users. BioMed Central 2016-11-16 /pmc/articles/PMC5112707/ /pubmed/27852219 http://dx.doi.org/10.1186/s12862-016-0827-5 Text en © The Author(s). 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	Computer vision applied to herbarium specimens of German trees: testing the future utility of the millions of herbarium specimen images for automated identification
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5112707/ https://www.ncbi.nlm.nih.gov/pubmed/27852219 http://dx.doi.org/10.1186/s12862-016-0827-5