
Parallel Processing Strategies for Big Geospatial Data

This paper provides an abstract analysis of parallel processing strategies for spatial and spatio-temporal data. It isolates aspects such as data locality and computational locality as well as redundancy and locally sequential access as central elements of parallel algorithm design for spatial data....


Bibliographic Details
Main Author: Werner, Martin
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7931969/
https://www.ncbi.nlm.nih.gov/pubmed/33693367
http://dx.doi.org/10.3389/fdata.2019.00044
author Werner, Martin
collection PubMed
description This paper provides an abstract analysis of parallel processing strategies for spatial and spatio-temporal data. It isolates aspects such as data locality and computational locality as well as redundancy and locally sequential access as central elements of parallel algorithm design for spatial data. Furthermore, the paper gives some examples from simple and advanced GIS and spatial data analysis highlighting both that big data systems have been around long before the current hype of big data and that they follow some design principles which are inevitable for spatial data including distributed data structures and messaging, which are, however, incompatible with the popular MapReduce paradigm. Throughout this discussion, the need for a replacement or extension of the MapReduce paradigm for spatial data is derived. This paradigm should be able to deal with the imperfect data locality inherent to spatial data hindering full independence of non-trivial computational tasks. We conclude that more research is needed and that spatial big data systems should pick up more concepts like graphs, shortest paths, raster data, events, and streams at the same time instead of solving exactly the set of spatially separable problems such as line simplifications or range queries in many different ways.
format Online
Article
Text
id pubmed-7931969
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-7931969 2021-03-09 Parallel Processing Strategies for Big Geospatial Data Werner, Martin Front Big Data Big Data Frontiers Media S.A. 2019-12-03 /pmc/articles/PMC7931969/ /pubmed/33693367 http://dx.doi.org/10.3389/fdata.2019.00044 Text en Copyright © 2019 Werner. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Parallel Processing Strategies for Big Geospatial Data
topic Big Data