Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment

Distributed Computing has achieved tremendous development since cloud computing was proposed in 2006, and played a vital role promoting rapid growth of data collecting and analysis models, e.g., Internet of things, Cyber-Physical Systems, Big Data Analytics, etc. Hadoop has become a data convergence...

Bibliographic Details
Main authors: Liu, Qi, Cai, Weidong, Jin, Dandan, Shen, Jian, Fu, Zhangjie, Liu, Xiaodong, Linge, Nigel
Format: Online Article Text
Language: English
Published: MDPI 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5038664/
https://www.ncbi.nlm.nih.gov/pubmed/27589753
http://dx.doi.org/10.3390/s16091386
_version_ 1782455924379615232
author Liu, Qi
Cai, Weidong
Jin, Dandan
Shen, Jian
Fu, Zhangjie
Liu, Xiaodong
Linge, Nigel
author_facet Liu, Qi
Cai, Weidong
Jin, Dandan
Shen, Jian
Fu, Zhangjie
Liu, Xiaodong
Linge, Nigel
author_sort Liu, Qi
collection PubMed
description Distributed computing has developed tremendously since cloud computing was proposed in 2006, and has played a vital role in promoting the rapid growth of data collection and analysis models, e.g., Internet of Things, Cyber-Physical Systems, and Big Data Analytics. Hadoop has become a data convergence platform for sensor networks. As one of its core components, MapReduce facilitates the allocation, processing and mining of collected large-scale data, where speculative execution strategies help solve straggler problems. However, there is still no efficient solution for accurately estimating the execution time of run-time tasks, which can affect task allocation and distribution in MapReduce. In this paper, task execution data have been collected and employed for the estimation. A two-phase regression (TPR) method is proposed to predict the finishing time of each task accurately. Detailed data for each task are examined, and a detailed analysis report is presented. According to the results, the prediction accuracy of concurrent tasks’ execution time can be improved, in particular for some regular jobs.
format Online
Article
Text
id pubmed-5038664
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-50386642016-09-29 Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment Liu, Qi Cai, Weidong Jin, Dandan Shen, Jian Fu, Zhangjie Liu, Xiaodong Linge, Nigel Sensors (Basel) Article Distributed Computing has achieved tremendous development since cloud computing was proposed in 2006, and played a vital role promoting rapid growth of data collecting and analysis models, e.g., Internet of things, Cyber-Physical Systems, Big Data Analytics, etc. Hadoop has become a data convergence platform for sensor networks. As one of the core components, MapReduce facilitates allocating, processing and mining of collected large-scale data, where speculative execution strategies help solve straggler problems. However, there is still no efficient solution for accurate estimation on execution time of run-time tasks, which can affect task allocation and distribution in MapReduce. In this paper, task execution data have been collected and employed for the estimation. A two-phase regression (TPR) method is proposed to predict the finishing time of each task accurately. Detailed data of each task have drawn interests with detailed analysis report being made. According to the results, the prediction accuracy of concurrent tasks’ execution time can be improved, in particular for some regular jobs. MDPI 2016-08-30 /pmc/articles/PMC5038664/ /pubmed/27589753 http://dx.doi.org/10.3390/s16091386 Text en © 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Liu, Qi
Cai, Weidong
Jin, Dandan
Shen, Jian
Fu, Zhangjie
Liu, Xiaodong
Linge, Nigel
Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment
title Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment
title_full Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment
title_fullStr Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment
title_full_unstemmed Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment
title_short Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment
title_sort estimation accuracy on execution time of run-time tasks in a heterogeneous distributed environment
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5038664/
https://www.ncbi.nlm.nih.gov/pubmed/27589753
http://dx.doi.org/10.3390/s16091386
work_keys_str_mv AT liuqi estimationaccuracyonexecutiontimeofruntimetasksinaheterogeneousdistributedenvironment
AT caiweidong estimationaccuracyonexecutiontimeofruntimetasksinaheterogeneousdistributedenvironment
AT jindandan estimationaccuracyonexecutiontimeofruntimetasksinaheterogeneousdistributedenvironment
AT shenjian estimationaccuracyonexecutiontimeofruntimetasksinaheterogeneousdistributedenvironment
AT fuzhangjie estimationaccuracyonexecutiontimeofruntimetasksinaheterogeneousdistributedenvironment
AT liuxiaodong estimationaccuracyonexecutiontimeofruntimetasksinaheterogeneousdistributedenvironment
AT lingenigel estimationaccuracyonexecutiontimeofruntimetasksinaheterogeneousdistributedenvironment