Recommending plant taxa for supporting on-site species identification
BACKGROUND: Predicting a list of plant taxa most likely to be observed at a given geographical location and time is useful for many scenarios in biodiversity informatics. Since efficient plant species identification is impeded mainly by the large number of possible candidate species, providing a sho...
Main authors: | Wittich, Hans Christian; Seeland, Marco; Wäldchen, Jana; Rzanny, Michael; Mäder, Patrick |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2018 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5975699/ https://www.ncbi.nlm.nih.gov/pubmed/29843588 http://dx.doi.org/10.1186/s12859-018-2201-7 |
_version_ | 1783327037430169600 |
---|---|
author | Wittich, Hans Christian; Seeland, Marco; Wäldchen, Jana; Rzanny, Michael; Mäder, Patrick |
author_sort | Wittich, Hans Christian |
collection | PubMed |
description | BACKGROUND: Predicting a list of plant taxa most likely to be observed at a given geographical location and time is useful for many scenarios in biodiversity informatics. Since efficient plant species identification is impeded mainly by the large number of possible candidate species, providing a shortlist of likely candidates can help significantly expedite the task. Whereas species distribution models heavily rely on geo-referenced occurrence data, such information still remains largely unused for plant taxa identification tools. RESULTS: In this paper, we conduct a study on the feasibility of computing a ranked shortlist of plant taxa likely to be encountered by an observer in the field. We use the territory of Germany as case study with a total of 7.62M records of freely available plant presence-absence data and occurrence records for 2.7k plant taxa. We systematically study achievable recommendation quality based on two types of source data: binary presence-absence data and individual occurrence records. Furthermore, we study strategies for aggregating records into a taxa recommendation based on location and date of an observation. CONCLUSION: We evaluate recommendations using 28k geo-referenced and taxa-labeled plant images hosted on the Flickr website as an independent test dataset. Relying on location information from presence-absence data alone results in an average recall of 82%. However, we find that occurrence records are complementary to presence-absence data and using both in combination yields considerably higher recall of 96% along with improved ranking metrics. Ultimately, by reducing the list of candidate taxa by an average of 62%, a spatio-temporal prior can substantially expedite the overall identification problem. |
format | Online Article Text |
id | pubmed-5975699 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5975699 2018-05-31 Recommending plant taxa for supporting on-site species identification Wittich, Hans Christian; Seeland, Marco; Wäldchen, Jana; Rzanny, Michael; Mäder, Patrick BMC Bioinformatics Research Article
BioMed Central 2018-05-30 /pmc/articles/PMC5975699/ /pubmed/29843588 http://dx.doi.org/10.1186/s12859-018-2201-7 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Article Wittich, Hans Christian Seeland, Marco Wäldchen, Jana Rzanny, Michael Mäder, Patrick Recommending plant taxa for supporting on-site species identification |
title | Recommending plant taxa for supporting on-site species identification |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5975699/ https://www.ncbi.nlm.nih.gov/pubmed/29843588 http://dx.doi.org/10.1186/s12859-018-2201-7 |