Going deeper in the automated identification of Herbarium specimens

Bibliographic Details
Main Authors: Carranza-Rojas, Jose; Goeau, Herve; Bonnet, Pierre; Mata-Montero, Erick; Joly, Alexis
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5553807/
https://www.ncbi.nlm.nih.gov/pubmed/28797242
http://dx.doi.org/10.1186/s12862-017-1014-z
author Carranza-Rojas, Jose
Goeau, Herve
Bonnet, Pierre
Mata-Montero, Erick
Joly, Alexis
collection PubMed
description BACKGROUND: Hundreds of herbarium collections have accumulated a valuable heritage and knowledge of plants over several centuries. Recent initiatives started ambitious preservation plans to digitize this information and make it available to botanists and the general public through web portals. However, thousands of sheets are still unidentified at the species level, while numerous sheets should be reviewed and updated following more recent taxonomic knowledge. These annotations and revisions require an unrealistic amount of work for botanists to carry out in a reasonable time. Computer vision and machine learning approaches applied to herbarium sheets are promising but are still not well studied compared to automated species identification from leaf scans or pictures of plants in the field. RESULTS: In this work, we propose to study and evaluate the accuracy with which herbarium images can potentially be exploited for species identification with deep learning technology. In addition, we propose to study whether combining herbarium sheets with photos of plants in the field is relevant in terms of accuracy. Finally, we explore whether herbarium images from one region with one specific flora can be used for transfer learning to another region with other species, for example a region under-represented in terms of collected data. CONCLUSIONS: This is, to our knowledge, the first study that uses deep learning to analyze a large dataset with thousands of species from herbaria. Results show the potential of deep learning for herbarium species identification, particularly by training and testing across different datasets from different herbaria. This could potentially lead to the creation of a semi- or even fully automated system to help taxonomists and experts with their annotation, classification, and revision work.
format Online
Article
Text
id pubmed-5553807
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5553807 2017-08-15 Going deeper in the automated identification of Herbarium specimens. Carranza-Rojas, Jose; Goeau, Herve; Bonnet, Pierre; Mata-Montero, Erick; Joly, Alexis. BMC Evol Biol. Research Article. BioMed Central 2017-08-11 /pmc/articles/PMC5553807/ /pubmed/28797242 http://dx.doi.org/10.1186/s12862-017-1014-z Text en © The Author(s) 2017. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Going deeper in the automated identification of Herbarium specimens
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5553807/
https://www.ncbi.nlm.nih.gov/pubmed/28797242
http://dx.doi.org/10.1186/s12862-017-1014-z