
Big Data processing experience in the ATLAS experiment


Bibliographic Details

Main Author: Vaniachine, A
Language: eng
Published: 2014
Subjects: Detectors and Experimental Techniques
Online Access: http://cds.cern.ch/record/1668914
_version_ 1780935520693518336
author Vaniachine, A
author_facet Vaniachine, A
author_sort Vaniachine, A
collection CERN
description To improve the data quality for physics analysis, the ATLAS collaboration completed three major data reprocessing campaigns on the Grid during 2010-2012, with up to 2 PB of data being reprocessed every year. The Worldwide LHC Computing Grid provided petabytes of disk storage and tens of thousands of job slots for faster throughput. High throughput is critical for the timely completion of the reprocessing campaigns conducted in preparation for major physics conferences. In the 2011 reprocessing, the throughput doubled in comparison with the 2010 campaign. To deliver new physics results for the 2013 Moriond Conference, ATLAS reprocessed twice as much data in November 2012 within the same time period as in the 2011 reprocessing, even though, due to increased LHC pileup, 2012 pp events took twice as long to reconstruct as 2011 events. To achieve this throughput, the number of jobs running concurrently exceeded 33k during the ATLAS reprocessing campaign in November 2012. For comparison, the daily average number of running jobs remained below 20k during the “legacy” reprocessing of 2012 pp data conducted by the CMS experiment in January-March 2013. The demands on Grid computing resources continue to grow, as scheduled LHC upgrades will increase data-taking rates tenfold. Since a tenfold increase in WLCG resources is not an option, a comprehensive model for the composition and execution of the data processing workflow within given CPU and storage constraints is necessary to accommodate the physics needs of the next LHC run. We report on experience gained in ATLAS Big Data processing and on efforts underway to scale up Grid data processing beyond petabytes.
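The scaling argument in the description is essentially arithmetic: reprocessing twice as much data, whose events each take twice as long to reconstruct, within the same wall-clock window requires roughly four times the aggregate compute, which is why the concurrent job count had to grow. Below is a minimal back-of-envelope sketch of that scaling; only the 2x factors and the job counts come from the abstract, while the per-event CPU time, event count, and campaign duration are hypothetical numbers chosen purely for illustration.

```python
# Back-of-envelope scaling of reprocessing throughput between campaigns.
# Only the 2x data-volume and 2x reconstruction-time factors are taken from
# the abstract; the concrete numbers below are illustrative assumptions.

def required_job_slots(events, cpu_sec_per_event, wall_clock_days):
    """Concurrent job slots needed to reconstruct `events` within the window,
    assuming one event per core at a time and perfect scheduling."""
    wall_clock_sec = wall_clock_days * 24 * 3600
    total_cpu_sec = events * cpu_sec_per_event
    return total_cpu_sec / wall_clock_sec

# Hypothetical 2011-style campaign: N events, t seconds/event, D days of running.
events_2011, cpu_2011, window_days = 1.0e9, 30.0, 60
slots_2011 = required_job_slots(events_2011, cpu_2011, window_days)

# 2012 campaign per the abstract: twice the data, twice the reconstruction
# time per event, same wall-clock window -> roughly 4x the concurrent slots.
slots_2012 = required_job_slots(2 * events_2011, 2 * cpu_2011, window_days)

print(f"2011-style campaign needs ~{slots_2011:,.0f} concurrent slots")
print(f"2012-style campaign needs ~{slots_2012:,.0f} concurrent slots")
print(f"scaling factor: {slots_2012 / slots_2011:.1f}x")  # -> 4.0x
```

The same arithmetic shows why a tenfold increase in data-taking rate cannot be absorbed by throughput alone without either more resources or a revised workflow model, as the abstract argues.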
id cern-1668914
institution European Organization for Nuclear Research (CERN)
language eng
publishDate 2014
record_format invenio
spelling cern-1668914 2019-09-30T06:29:59Z http://cds.cern.ch/record/1668914 eng Vaniachine, A Big Data processing experience in the ATLAS experiment Detectors and Experimental Techniques [abstract text as in the description field above] ATL-SOFT-SLIDE-2014-112 oai:cds.cern.ch:1668914 2014-03-14
spellingShingle Detectors and Experimental Techniques
Vaniachine, A
Big Data processing experience in the ATLAS experiment
title Big Data processing experience in the ATLAS experiment
title_full Big Data processing experience in the ATLAS experiment
title_fullStr Big Data processing experience in the ATLAS experiment
title_full_unstemmed Big Data processing experience in the ATLAS experiment
title_short Big Data processing experience in the ATLAS experiment
title_sort big data processing experience in the atlas experiment
topic Detectors and Experimental Techniques
url http://cds.cern.ch/record/1668914
work_keys_str_mv AT vaniachinea bigdataprocessingexperienceintheatlasexperiment