
SEDE-GPS: socio-economic data enrichment based on GPS information

BACKGROUND: Microbes are essential components of all ecosystems because they drive many biochemical processes and act as primary producers. In freshwater ecosystems, the biodiversity in and the composition of microbial communities can be used as indicators for environmental quality. Recently, some e...


Bibliographic Details
Main authors: Sperlea, Theodor; Füser, Stefan; Boenigk, Jens; Heider, Dominik
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6266930/
https://www.ncbi.nlm.nih.gov/pubmed/30497363
http://dx.doi.org/10.1186/s12859-018-2419-4
author Sperlea, Theodor
Füser, Stefan
Boenigk, Jens
Heider, Dominik
collection PubMed
description BACKGROUND: Microbes are essential components of all ecosystems because they drive many biochemical processes and act as primary producers. In freshwater ecosystems, the biodiversity in and the composition of microbial communities can be used as indicators for environmental quality. Recently, some environmental features have been identified that influence microbial ecosystems. However, the impact of human action on lake microbiomes is not well understood. This is, in part, due to the fact that environmental data is, albeit theoretically accessible, not easily available. RESULTS: In this work, we present SEDE-GPS, a tool that gathers data that are relevant to the environment of a user-provided GPS coordinate. To this end, it accesses a list of public and corporate databases and aggregates the information in a single file, which can be used for further analysis. To showcase the use of SEDE-GPS, we enriched a lake microbial ecology sequencing dataset with around 18,000 socio-economic, climate, and geographic features. The sources of SEDE-GPS are public databases such as Eurostat, the Climate Data Center, and OpenStreetMap, as well as corporate sources such as Twitter. Using machine learning and feature selection methods, we were able to identify features in the data provided by SEDE-GPS that can be used to predict lake microbiome alpha diversity. CONCLUSION: The results presented in this study show that SEDE-GPS is a handy and easy-to-use tool for comprehensive data enrichment for studies of ecology and other processes that are affected by environmental features. Furthermore, we present lists of environmental, socio-economic, and climate features that are predictive of microbial biodiversity in lake ecosystems. These lists indicate that human action has a major impact on lake microbiomes.
SEDE-GPS and its source code are available for download at http://SEDE-GPS.heiderlab.de ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2419-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6266930
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6266930 2018-12-05 SEDE-GPS: socio-economic data enrichment based on GPS information Sperlea, Theodor Füser, Stefan Boenigk, Jens Heider, Dominik BMC Bioinformatics Software BioMed Central 2018-11-30 /pmc/articles/PMC6266930/ /pubmed/30497363 http://dx.doi.org/10.1186/s12859-018-2419-4 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title SEDE-GPS: socio-economic data enrichment based on GPS information
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6266930/
https://www.ncbi.nlm.nih.gov/pubmed/30497363
http://dx.doi.org/10.1186/s12859-018-2419-4