Experimental Analysis in Hadoop MapReduce: A Closer Look at Fault Detection and Recovery Techniques

Hadoop MapReduce reactively detects and recovers faults after they occur based on the static heartbeat detection and the re-execution from scratch techniques. However, these techniques lead to excessive response time penalties and inefficient resource consumption during detection and recovery. Exist...

Bibliographic Details
Main Authors: Saadoon, Muntadher; Hamid, Siti Hafizah Ab; Sofian, Hazrina; Altarturi, Hamza; Nasuha, Nur; Azizul, Zati Hakim; Sani, Asmiza Abdul; Asemi, Adeleh
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8199096/
https://www.ncbi.nlm.nih.gov/pubmed/34072632
http://dx.doi.org/10.3390/s21113799
_version_ 1783707295898664960
author Saadoon, Muntadher
Hamid, Siti Hafizah Ab
Sofian, Hazrina
Altarturi, Hamza
Nasuha, Nur
Azizul, Zati Hakim
Sani, Asmiza Abdul
Asemi, Adeleh
author_sort Saadoon, Muntadher
collection PubMed
description Hadoop MapReduce reactively detects and recovers faults after they occur based on the static heartbeat detection and the re-execution from scratch techniques. However, these techniques lead to excessive response time penalties and inefficient resource consumption during detection and recovery. Existing fault-tolerance solutions intend to mitigate the limitations without considering critical conditions such as fail-slow faults, the impact of faults at various infrastructure levels and the relationship between the detection and recovery stages. This paper analyses the response time under two main conditions: fail-stop and fail-slow, when they manifest at the node, service, and task levels at runtime. In addition, we focus on the relationship between the time for detecting and recovering faults. The experimental analysis is conducted on a real Hadoop cluster comprising MapReduce, YARN and HDFS frameworks. Our analysis shows that the recovery of a single fault leads to an average of 67.6% response time penalty. Even though the detection and recovery times are well-tuned, data locality and resource availability must also be considered to obtain the optimum tolerance time and the lowest penalties.
format Online
Article
Text
id pubmed-8199096
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8199096 2021-06-14 Sensors (Basel) Article MDPI 2021-05-31 /pmc/articles/PMC8199096/ /pubmed/34072632 http://dx.doi.org/10.3390/s21113799 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Experimental Analysis in Hadoop MapReduce: A Closer Look at Fault Detection and Recovery Techniques
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8199096/
https://www.ncbi.nlm.nih.gov/pubmed/34072632
http://dx.doi.org/10.3390/s21113799