Recommending plant taxa for supporting on-site species identification

Bibliographic Details
Main authors: Wittich, Hans Christian; Seeland, Marco; Wäldchen, Jana; Rzanny, Michael; Mäder, Patrick
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5975699/
https://www.ncbi.nlm.nih.gov/pubmed/29843588
http://dx.doi.org/10.1186/s12859-018-2201-7
author Wittich, Hans Christian
Seeland, Marco
Wäldchen, Jana
Rzanny, Michael
Mäder, Patrick
collection PubMed
description BACKGROUND: Predicting a list of plant taxa most likely to be observed at a given geographical location and time is useful for many scenarios in biodiversity informatics. Since efficient plant species identification is impeded mainly by the large number of possible candidate species, providing a shortlist of likely candidates can help significantly expedite the task. Whereas species distribution models heavily rely on geo-referenced occurrence data, such information still remains largely unused for plant taxa identification tools. RESULTS: In this paper, we conduct a study on the feasibility of computing a ranked shortlist of plant taxa likely to be encountered by an observer in the field. We use the territory of Germany as a case study with a total of 7.62M records of freely available plant presence-absence data and occurrence records for 2.7k plant taxa. We systematically study achievable recommendation quality based on two types of source data: binary presence-absence data and individual occurrence records. Furthermore, we study strategies for aggregating records into a taxa recommendation based on location and date of an observation. CONCLUSION: We evaluate recommendations using 28k geo-referenced and taxa-labeled plant images hosted on the Flickr website as an independent test dataset. Relying on location information from presence-absence data alone results in an average recall of 82%. However, we find that occurrence records are complementary to presence-absence data and using both in combination yields considerably higher recall of 96% along with improved ranking metrics. Ultimately, by reducing the list of candidate taxa by an average of 62%, a spatio-temporal prior can substantially expedite the overall identification problem.
format Online
Article
Text
id pubmed-5975699
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-59756992018-05-31 Recommending plant taxa for supporting on-site species identification Wittich, Hans Christian Seeland, Marco Wäldchen, Jana Rzanny, Michael Mäder, Patrick BMC Bioinformatics Research Article BACKGROUND: Predicting a list of plant taxa most likely to be observed at a given geographical location and time is useful for many scenarios in biodiversity informatics. Since efficient plant species identification is impeded mainly by the large number of possible candidate species, providing a shortlist of likely candidates can help significantly expedite the task. Whereas species distribution models heavily rely on geo-referenced occurrence data, such information still remains largely unused for plant taxa identification tools. RESULTS: In this paper, we conduct a study on the feasibility of computing a ranked shortlist of plant taxa likely to be encountered by an observer in the field. We use the territory of Germany as case study with a total of 7.62M records of freely available plant presence-absence data and occurrence records for 2.7k plant taxa. We systematically study achievable recommendation quality based on two types of source data: binary presence-absence data and individual occurrence records. Furthermore, we study strategies for aggregating records into a taxa recommendation based on location and date of an observation. CONCLUSION: We evaluate recommendations using 28k geo-referenced and taxa-labeled plant images hosted on the Flickr website as an independent test dataset. Relying on location information from presence-absence data alone results in an average recall of 82%. However, we find that occurrence records are complementary to presence-absence data and using both in combination yields considerably higher recall of 96% along with improved ranking metrics. Ultimately, by reducing the list of candidate taxa by an average of 62%, a spatio-temporal prior can substantially expedite the overall identification problem. 
BioMed Central 2018-05-30 /pmc/articles/PMC5975699/ /pubmed/29843588 http://dx.doi.org/10.1186/s12859-018-2201-7 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Recommending plant taxa for supporting on-site species identification
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5975699/
https://www.ncbi.nlm.nih.gov/pubmed/29843588
http://dx.doi.org/10.1186/s12859-018-2201-7