Parallel Processing Strategies for Big Geospatial Data
This paper provides an abstract analysis of parallel processing strategies for spatial and spatio-temporal data. It isolates aspects such as data locality and computational locality as well as redundancy and locally sequential access as central elements of parallel algorithm design for spatial data....
Main author: | Werner, Martin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Frontiers Media S.A.
2019
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7931969/ https://www.ncbi.nlm.nih.gov/pubmed/33693367 http://dx.doi.org/10.3389/fdata.2019.00044 |
_version_ | 1783660394266492928 |
---|---|
author | Werner, Martin |
author_facet | Werner, Martin |
author_sort | Werner, Martin |
collection | PubMed |
description | This paper provides an abstract analysis of parallel processing strategies for spatial and spatio-temporal data. It isolates aspects such as data locality and computational locality as well as redundancy and locally sequential access as central elements of parallel algorithm design for spatial data. Furthermore, the paper gives some examples from simple and advanced GIS and spatial data analysis highlighting both that big data systems have been around long before the current hype of big data and that they follow some design principles which are inevitable for spatial data including distributed data structures and messaging, which are, however, incompatible with the popular MapReduce paradigm. Throughout this discussion, the need for a replacement or extension of the MapReduce paradigm for spatial data is derived. This paradigm should be able to deal with the imperfect data locality inherent to spatial data hindering full independence of non-trivial computational tasks. We conclude that more research is needed and that spatial big data systems should pick up more concepts like graphs, shortest paths, raster data, events, and streams at the same time instead of solving exactly the set of spatially separable problems such as line simplifications or range queries in many different ways. |
format | Online Article Text |
id | pubmed-7931969 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-79319692021-03-09 Parallel Processing Strategies for Big Geospatial Data Werner, Martin Front Big Data Big Data This paper provides an abstract analysis of parallel processing strategies for spatial and spatio-temporal data. It isolates aspects such as data locality and computational locality as well as redundancy and locally sequential access as central elements of parallel algorithm design for spatial data. Furthermore, the paper gives some examples from simple and advanced GIS and spatial data analysis highlighting both that big data systems have been around long before the current hype of big data and that they follow some design principles which are inevitable for spatial data including distributed data structures and messaging, which are, however, incompatible with the popular MapReduce paradigm. Throughout this discussion, the need for a replacement or extension of the MapReduce paradigm for spatial data is derived. This paradigm should be able to deal with the imperfect data locality inherent to spatial data hindering full independence of non-trivial computational tasks. We conclude that more research is needed and that spatial big data systems should pick up more concepts like graphs, shortest paths, raster data, events, and streams at the same time instead of solving exactly the set of spatially separable problems such as line simplifications or range queries in many different ways. Frontiers Media S.A. 2019-12-03 /pmc/articles/PMC7931969/ /pubmed/33693367 http://dx.doi.org/10.3389/fdata.2019.00044 Text en Copyright © 2019 Werner. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. 
No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Big Data Werner, Martin Parallel Processing Strategies for Big Geospatial Data |
title | Parallel Processing Strategies for Big Geospatial Data |
title_full | Parallel Processing Strategies for Big Geospatial Data |
title_fullStr | Parallel Processing Strategies for Big Geospatial Data |
title_full_unstemmed | Parallel Processing Strategies for Big Geospatial Data |
title_short | Parallel Processing Strategies for Big Geospatial Data |
title_sort | parallel processing strategies for big geospatial data |
topic | Big Data |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7931969/ https://www.ncbi.nlm.nih.gov/pubmed/33693367 http://dx.doi.org/10.3389/fdata.2019.00044 |
work_keys_str_mv | AT wernermartin parallelprocessingstrategiesforbiggeospatialdata |