Using hadoop file system and MapReduce in a small/medium Grid site

Data storage and data access are key to CPU-intensive and data-intensive high-performance Grid computing. Hadoop is an open-source data processing framework that includes a fault-tolerant and scalable distributed data processing model and execution environment, named MapReduce, and distribut...

Bibliographic Details
Main authors:	Riahi, H; Donvito, G; Fano, L; Fasi, M; Marzulli, G; Spiga, D; Valentini, A
Language:	eng
Published:	2012
Subjects:	Computing and Computers
Online access:	https://dx.doi.org/10.1088/1742-6596/396/4/042050 http://cds.cern.ch/record/1565910

collection	CERN
description	Data storage and data access are key to CPU-intensive and data-intensive high-performance Grid computing. Hadoop is an open-source data processing framework that includes a fault-tolerant and scalable distributed data processing model and execution environment, named MapReduce, and a distributed file system, named Hadoop Distributed File System (HDFS). HDFS was deployed and tested within the Open Science Grid (OSG) middleware stack. Efforts have been made to integrate HDFS with the gLite middleware. We tested the file system thoroughly to understand its scalability and fault tolerance under the constraints of a small/medium site environment. To benefit fully from this file system, we made it work in conjunction with the Hadoop job scheduler to optimize the execution of local physics analysis workflows. The performance of the analysis jobs that used this architecture appears promising, making it worth following up in the future.
id	cern-1565910
institution	Organización Europea para la Investigación Nuclear
record_format	invenio
