
XRootD popularity on hadoop clusters

Performance data and metadata of the computing operations at the CMS experiment are collected through a distributed monitoring infrastructure, currently relying on a traditional Oracle database system. This paper shows how to harness Big Data architectures in order to improve the throughput and the...


Bibliographic Details
Main Authors:	Meoni, Marco; Boccali, Tommaso; Magini, Nicolò; Menichetti, Luca; Giordano, Domenico
Language:	eng
Published:	2017
Subjects:	Computing and Computers
Online Access:	https://dx.doi.org/10.1088/1742-6596/898/7/072027 http://cds.cern.ch/record/2296795

_version_	1780956905373433856
author	Meoni, Marco; Boccali, Tommaso; Magini, Nicolò; Menichetti, Luca; Giordano, Domenico
collection	CERN
description	Performance data and metadata of the computing operations at the CMS experiment are collected through a distributed monitoring infrastructure, currently relying on a traditional Oracle database system. This paper shows how to harness Big Data architectures in order to improve the throughput and the efficiency of such monitoring. A large set of operational data - user activities, job submissions, resources, file transfers, site efficiencies, software releases, network traffic, machine logs - is being injected into a readily available Hadoop cluster, via several data streamers. The collected metadata is further organized running fast arbitrary queries; this offers the ability to test several Map&Reduce-based frameworks and measure the system speed-up when compared to the original database infrastructure. By leveraging a quality Hadoop data store and enabling an analytics framework on top, it is possible to design a mining platform to predict dataset popularity and discover patterns and correlations.
id	oai-inspirehep.net-1638557
institution	Organización Europea para la Investigación Nuclear
language	eng
publishDate	2017
record_format	invenio
title	XRootD popularity on hadoop clusters
topic	Computing and Computers
url	https://dx.doi.org/10.1088/1742-6596/898/7/072027 http://cds.cern.ch/record/2296795
