Enabling Big Geoscience Data Analytics with a Cloud-Based, MapReduce-Enabled and Service-Oriented Workflow Framework
Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data is essential for geoscience studies. However, the tasks are challenging for geoscientists because processing the massive amount of data is both computing and data in...
Main authors: | Li, Zhenlong; Yang, Chaowei; Jin, Baoxuan; Yu, Manzhu; Liu, Kai; Sun, Min; Zhan, Matthew |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2015 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351198/ https://www.ncbi.nlm.nih.gov/pubmed/25742012 http://dx.doi.org/10.1371/journal.pone.0116781 |
_version_ | 1782360300860735488 |
---|---|
author | Li, Zhenlong Yang, Chaowei Jin, Baoxuan Yu, Manzhu Liu, Kai Sun, Min Zhan, Matthew |
author_facet | Li, Zhenlong Yang, Chaowei Jin, Baoxuan Yu, Manzhu Liu, Kai Sun, Min Zhan, Matthew |
author_sort | Li, Zhenlong |
collection | PubMed |
description | Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data is essential for geoscience studies. However, the tasks are challenging for geoscientists because processing the massive amount of data is both computing- and data-intensive, in that data analytics requires complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics. The framework leverages cloud computing, MapReduce, and Service-Oriented Architecture (SOA). Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers. A MapReduce-based algorithm framework is developed to support parallel processing of geoscience data. A service-oriented workflow architecture is built to support on-demand complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that this innovative framework significantly improves the efficiency of big geoscience data analytics by reducing the data processing time as well as simplifying data analytical procedures for geoscientists. |
format | Online Article Text |
id | pubmed-4351198 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4351198 2015-03-17 Enabling Big Geoscience Data Analytics with a Cloud-Based, MapReduce-Enabled and Service-Oriented Workflow Framework Li, Zhenlong Yang, Chaowei Jin, Baoxuan Yu, Manzhu Liu, Kai Sun, Min Zhan, Matthew PLoS One Research Article Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data is essential for geoscience studies. However, the tasks are challenging for geoscientists because processing the massive amount of data is both computing- and data-intensive, in that data analytics requires complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics. The framework leverages cloud computing, MapReduce, and Service-Oriented Architecture (SOA). Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers. A MapReduce-based algorithm framework is developed to support parallel processing of geoscience data. A service-oriented workflow architecture is built to support on-demand complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that this innovative framework significantly improves the efficiency of big geoscience data analytics by reducing the data processing time as well as simplifying data analytical procedures for geoscientists. Public Library of Science 2015-03-05 /pmc/articles/PMC4351198/ /pubmed/25742012 http://dx.doi.org/10.1371/journal.pone.0116781 Text en © 2015 Li et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Li, Zhenlong Yang, Chaowei Jin, Baoxuan Yu, Manzhu Liu, Kai Sun, Min Zhan, Matthew Enabling Big Geoscience Data Analytics with a Cloud-Based, MapReduce-Enabled and Service-Oriented Workflow Framework |
title | Enabling Big Geoscience Data Analytics with a Cloud-Based, MapReduce-Enabled and Service-Oriented Workflow Framework |
title_full | Enabling Big Geoscience Data Analytics with a Cloud-Based, MapReduce-Enabled and Service-Oriented Workflow Framework |
title_fullStr | Enabling Big Geoscience Data Analytics with a Cloud-Based, MapReduce-Enabled and Service-Oriented Workflow Framework |
title_full_unstemmed | Enabling Big Geoscience Data Analytics with a Cloud-Based, MapReduce-Enabled and Service-Oriented Workflow Framework |
title_short | Enabling Big Geoscience Data Analytics with a Cloud-Based, MapReduce-Enabled and Service-Oriented Workflow Framework |
title_sort | enabling big geoscience data analytics with a cloud-based, mapreduce-enabled and service-oriented workflow framework |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4351198/ https://www.ncbi.nlm.nih.gov/pubmed/25742012 http://dx.doi.org/10.1371/journal.pone.0116781 |
work_keys_str_mv | AT lizhenlong enablingbiggeosciencedataanalyticswithacloudbasedmapreduceenabledandserviceorientedworkflowframework AT yangchaowei enablingbiggeosciencedataanalyticswithacloudbasedmapreduceenabledandserviceorientedworkflowframework AT jinbaoxuan enablingbiggeosciencedataanalyticswithacloudbasedmapreduceenabledandserviceorientedworkflowframework AT yumanzhu enablingbiggeosciencedataanalyticswithacloudbasedmapreduceenabledandserviceorientedworkflowframework AT liukai enablingbiggeosciencedataanalyticswithacloudbasedmapreduceenabledandserviceorientedworkflowframework AT sunmin enablingbiggeosciencedataanalyticswithacloudbasedmapreduceenabledandserviceorientedworkflowframework AT zhanmatthew enablingbiggeosciencedataanalyticswithacloudbasedmapreduceenabledandserviceorientedworkflowframework |