
Implementing data placement strategies for the CMS experiment based on a popularity model

During the first two years of data taking, the CMS experiment has collected over 20 PetaBytes of data and processed and analyzed it on the distributed, multi-tiered computing infrastructure on the WorldWide LHC Computing Grid. Given the increasing data volume that has to be stored...


Bibliographic Details
Main authors:	Giordano, Domenico; Barreiro Megino, Fernando Harald
Language:	eng
Published:	2012
Subjects:	Conferences
Online access:	http://cds.cern.ch/record/1460885

_version_	1780925266624774144
author	Giordano, Domenico Barreiro Megino, Fernando Harald
author_facet	Giordano, Domenico Barreiro Megino, Fernando Harald
author_sort	Giordano, Domenico
collection	CERN
description	During the first two years of data taking, the CMS experiment has collected over 20 petabytes of data and processed and analyzed it on the distributed, multi-tiered computing infrastructure of the Worldwide LHC Computing Grid. Given the increasing data volume that has to be stored and efficiently analyzed, it is a challenge for several LHC experiments to optimize and automate the data placement strategies in order to profit fully from the available network and storage resources and to facilitate daily computing operations. Building on previous experience acquired by ATLAS, we have developed the CMS Popularity Service, which tracks file accesses and user activity on the grid and will serve as the foundation for the evolution of CMS data placement. A fully automated, popularity-based site-cleaning agent has been deployed in order to scan Tier-2 sites that are reaching their space quota and suggest obsolete, unused data that can be safely deleted without disrupting analysis activity. Future work will be to demonstrate dynamic data placement functionality based on this popularity service and integrate it into the data and workload management systems: as a consequence, the pre-placement of data will be minimized and additional replication of hot datasets will be requested automatically. This paper gives an insight into the development, validation and production process and analyzes how the framework has influenced resource optimization and daily operations in CMS.
id	cern-1460885
institution	Organización Europea para la Investigación Nuclear
language	eng
publishDate	2012
record_format	invenio
spelling	cern-1460885 2022-11-02T22:23:31Z http://cds.cern.ch/record/1460885 eng Giordano, Domenico Barreiro Megino, Fernando Harald Implementing data placement strategies for the CMS experiment based on a popularity model Computing in High Energy and Nuclear Physics (CHEP) 2012 Conferences During the first two years of data taking, the CMS experiment has collected over 20 petabytes of data and processed and analyzed it on the distributed, multi-tiered computing infrastructure of the Worldwide LHC Computing Grid. Given the increasing data volume that has to be stored and efficiently analyzed, it is a challenge for several LHC experiments to optimize and automate the data placement strategies in order to profit fully from the available network and storage resources and to facilitate daily computing operations. Building on previous experience acquired by ATLAS, we have developed the CMS Popularity Service, which tracks file accesses and user activity on the grid and will serve as the foundation for the evolution of CMS data placement. A fully automated, popularity-based site-cleaning agent has been deployed in order to scan Tier-2 sites that are reaching their space quota and suggest obsolete, unused data that can be safely deleted without disrupting analysis activity. Future work will be to demonstrate dynamic data placement functionality based on this popularity service and integrate it into the data and workload management systems: as a consequence, the pre-placement of data will be minimized and additional replication of hot datasets will be requested automatically. This paper gives an insight into the development, validation and production process and analyzes how the framework has influenced resource optimization and daily operations in CMS. oai:cds.cern.ch:1460885 2012
spellingShingle	Conferences Giordano, Domenico Barreiro Megino, Fernando Harald Implementing data placement strategies for the CMS experiment based on a popularity model
title	Implementing data placement strategies for the CMS experiment based on a popularity model
title_full	Implementing data placement strategies for the CMS experiment based on a popularity model
title_fullStr	Implementing data placement strategies for the CMS experiment based on a popularity model
title_full_unstemmed	Implementing data placement strategies for the CMS experiment based on a popularity model
title_short	Implementing data placement strategies for the CMS experiment based on a popularity model
title_sort	implementing data placement strategies for the cms experiment based on a popularity model
topic	Conferences
url	http://cds.cern.ch/record/1460885
work_keys_str_mv	AT giordanodomenico implementingdataplacementstrategiesforthecmsexperimentbasedonapopularitymode AT barreiromeginofernandoharald implementingdataplacementstrategiesforthecmsexperimentbasedonapopularitymode AT giordanodomenico computinginhighenergyandnuclearphysicschep2012 AT barreiromeginofernandoharald computinginhighenergyandnuclearphysicschep2012
