Acquiring and preprocessing leaf images for automated plant identification: understanding the tradeoff between effort and information gain

Bibliographic Details
Main Authors: Rzanny, Michael; Seeland, Marco; Wäldchen, Jana; Mäder, Patrick
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5678587/
https://www.ncbi.nlm.nih.gov/pubmed/29151843
http://dx.doi.org/10.1186/s13007-017-0245-8
author Rzanny, Michael
Seeland, Marco
Wäldchen, Jana
Mäder, Patrick
collection PubMed
description BACKGROUND: Automated species identification is a long-standing research subject. Unlike flowers and fruits, leaves are available throughout most of the year. Because their margin and texture help characterize a species, leaves are the most studied organ for automated identification. Substantially matured machine learning techniques create a need for more training data (i.e., leaf images). Researchers and enthusiasts alike lack guidance on how to acquire suitable training images efficiently. METHODS: In this paper, we systematically study nine image types and three preprocessing strategies. Image types vary in their in-situ recording conditions (perspective, illumination, and background), while the preprocessing strategies compare non-preprocessed, cropped, and segmented images. For each combination of image type and preprocessing strategy, we also quantify the manual effort required. We extract image features using a convolutional neural network, classify species using the resulting feature vectors, and discuss classification accuracy in relation to the required effort per combination. RESULTS: The most effective non-destructive way to record herbaceous leaves is to take an image of the leaf’s top side. We achieve the highest classification accuracy using destructive back-light images, i.e., holding the plucked leaf against the sky for image acquisition. Cropping the image to the leaf’s boundary substantially improves accuracy, while precise segmentation yields similar accuracy at substantially higher effort. Consistently using or not using a flash has negligible effects. Imaging the typically more strongly textured back side of a leaf does not result in higher accuracy but notably increases the acquisition cost. CONCLUSIONS: The way in which leaf images are acquired and preprocessed has a substantial effect on the accuracy of the classifier trained on them. For the first time, this study provides a systematic guideline allowing researchers to spend available acquisition resources wisely while achieving optimal classification accuracy.
format Online
Article
Text
id pubmed-5678587
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5678587 2017-11-17 Acquiring and preprocessing leaf images for automated plant identification: understanding the tradeoff between effort and information gain Rzanny, Michael Seeland, Marco Wäldchen, Jana Mäder, Patrick Plant Methods Research BioMed Central 2017-11-08 /pmc/articles/PMC5678587/ /pubmed/29151843 http://dx.doi.org/10.1186/s13007-017-0245-8 Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Acquiring and preprocessing leaf images for automated plant identification: understanding the tradeoff between effort and information gain
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5678587/
https://www.ncbi.nlm.nih.gov/pubmed/29151843
http://dx.doi.org/10.1186/s13007-017-0245-8