A semi-automated workflow for biodiversity data retrieval, cleaning, and quality control
Abstract. The compilation and cleaning of data needed for analyses and prediction of species distributions is a time consuming process requiring a solid understanding of data formats and service APIs provided by biodiversity informatics infrastructures. We designed and implemented a Taverna-based Da...
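Main authors: | Mathew, Cherian; Güntsch, Anton; Obst, Matthias; Vicario, Saverio; Haines, Robert; Williams, Alan R.; de Jong, Yde; Goble, Carole |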
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Pensoft Publishers, 2014 |
Subjects: | Software Description |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4267104/ https://www.ncbi.nlm.nih.gov/pubmed/25535486 http://dx.doi.org/10.3897/BDJ.2.e4221 |
_version_ | 1782349102481145856 |
---|---|
author | Mathew, Cherian Güntsch, Anton Obst, Matthias Vicario, Saverio Haines, Robert Williams, Alan R. de Jong, Yde Goble, Carole |
author_facet | Mathew, Cherian Güntsch, Anton Obst, Matthias Vicario, Saverio Haines, Robert Williams, Alan R. de Jong, Yde Goble, Carole |
author_sort | Mathew, Cherian |
collection | PubMed |
description | Abstract. The compilation and cleaning of data needed for analyses and prediction of species distributions is a time consuming process requiring a solid understanding of data formats and service APIs provided by biodiversity informatics infrastructures. We designed and implemented a Taverna-based Data Refinement Workflow which integrates taxonomic data retrieval, data cleaning, and data selection into a consistent, standards-based, and effective system hiding the complexity of underlying service infrastructures. The workflow can be freely used both locally and through a web-portal which does not require additional software installations by users. |
format | Online Article Text |
id | pubmed-4267104 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Pensoft Publishers |
record_format | MEDLINE/PubMed |
spelling | pubmed-42671042014-12-22 A semi-automated workflow for biodiversity data retrieval, cleaning, and quality control Mathew, Cherian Güntsch, Anton Obst, Matthias Vicario, Saverio Haines, Robert Williams, Alan R. de Jong, Yde Goble, Carole Biodivers Data J Software Description Abstract. The compilation and cleaning of data needed for analyses and prediction of species distributions is a time consuming process requiring a solid understanding of data formats and service APIs provided by biodiversity informatics infrastructures. We designed and implemented a Taverna-based Data Refinement Workflow which integrates taxonomic data retrieval, data cleaning, and data selection into a consistent, standards-based, and effective system hiding the complexity of underlying service infrastructures. The workflow can be freely used both locally and through a web-portal which does not require additional software installations by users. Pensoft Publishers 2014-12-11 /pmc/articles/PMC4267104/ /pubmed/25535486 http://dx.doi.org/10.3897/BDJ.2.e4221 Text en Cherian Mathew, Anton Güntsch, Matthias Obst, Saverio Vicario, Robert Haines, Alan R. Williams, Yde de Jong, Carole Goble http://creativecommons.org/licenses/by/4.0 This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (CC-BY), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Software Description Mathew, Cherian Güntsch, Anton Obst, Matthias Vicario, Saverio Haines, Robert Williams, Alan R. de Jong, Yde Goble, Carole A semi-automated workflow for biodiversity data retrieval, cleaning, and quality control |
title | A semi-automated workflow for biodiversity data retrieval, cleaning, and quality control |
title_full | A semi-automated workflow for biodiversity data retrieval, cleaning, and quality control |
title_fullStr | A semi-automated workflow for biodiversity data retrieval, cleaning, and quality control |
title_full_unstemmed | A semi-automated workflow for biodiversity data retrieval, cleaning, and quality control |
title_short | A semi-automated workflow for biodiversity data retrieval, cleaning, and quality control |
title_sort | semi-automated workflow for biodiversity data retrieval, cleaning, and quality control |
topic | Software Description |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4267104/ https://www.ncbi.nlm.nih.gov/pubmed/25535486 http://dx.doi.org/10.3897/BDJ.2.e4221 |