
Big Data processing experience in the ATLAS experiment


Bibliographic Details

Main Author: Vaniachine, A
Language: eng
Published: 2014
Subjects: Detectors and Experimental Techniques
Online Access: http://cds.cern.ch/record/1668914
_version_ 1780935520693518336
author Vaniachine, A
author_facet Vaniachine, A
author_sort Vaniachine, A
collection CERN
description To improve the data quality for physics analysis, the ATLAS collaboration completed three major data reprocessing campaigns on the Grid during 2010-2012, with up to 2 PB of data being reprocessed every year. The Worldwide LHC Computing Grid provided petabytes of disk storage and tens of thousands of job slots for faster throughput. High throughput is critical for the timely completion of the reprocessing campaigns conducted in preparation for major physics conferences. In the 2011 reprocessing, the throughput doubled in comparison with the 2010 campaign. To deliver new physics results for the 2013 Moriond Conference, ATLAS reprocessed twice as much data in November 2012 within the same time period as in the 2011 reprocessing, even though, due to increased LHC pileup, 2012 pp events took twice as long to reconstruct as 2011 events. To achieve this throughput, the number of jobs running concurrently exceeded 33k during the ATLAS reprocessing campaign in November 2012. For comparison, the daily average number of running jobs remained below 20k during the “legacy” reprocessing of 2012 pp data conducted by the CMS experiment in January-March 2013. The demands on Grid computing resources continue to grow, as scheduled LHC upgrades will increase data-taking rates tenfold. Since a tenfold increase in WLCG resources is not an option, a comprehensive model for the composition and execution of the data processing workflow within given CPU and storage constraints is necessary to accommodate the physics needs of the next LHC run. We report on experience gained in ATLAS Big Data processing and on efforts underway to scale up Grid data processing beyond petabytes.
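The scaling argument in the description is essentially arithmetic: reprocessing twice as much data, whose events each take twice as long to reconstruct, within the same wall-clock window requires roughly four times the aggregate compute, which is why the concurrent job count had to grow. Below is a minimal back-of-envelope sketch of that scaling; only the 2x factors and the job counts come from the abstract, while the per-event CPU time, event count, and campaign duration are hypothetical numbers chosen purely for illustration.

```python
# Back-of-envelope scaling of reprocessing throughput between campaigns.
# Only the 2x data-volume and 2x reconstruction-time factors are taken from
# the abstract; the concrete numbers below are illustrative assumptions.

def required_job_slots(events, cpu_sec_per_event, wall_clock_days):
    """Concurrent job slots needed to reconstruct `events` within the window,
    assuming one event per core at a time and perfect scheduling."""
    wall_clock_sec = wall_clock_days * 24 * 3600
    total_cpu_sec = events * cpu_sec_per_event
    return total_cpu_sec / wall_clock_sec

# Hypothetical 2011-style campaign: N events, t seconds/event, D days of running.
events_2011, cpu_2011, window_days = 1.0e9, 30.0, 60
slots_2011 = required_job_slots(events_2011, cpu_2011, window_days)

# 2012 campaign per the abstract: twice the data, twice the reconstruction
# time per event, same wall-clock window -> roughly 4x the concurrent slots.
slots_2012 = required_job_slots(2 * events_2011, 2 * cpu_2011, window_days)

print(f"2011-style campaign needs ~{slots_2011:,.0f} concurrent slots")
print(f"2012-style campaign needs ~{slots_2012:,.0f} concurrent slots")
print(f"scaling factor: {slots_2012 / slots_2011:.1f}x")  # -> 4.0x
```

The same arithmetic shows why a tenfold increase in data-taking rate cannot be absorbed by throughput alone without either more resources or a revised workflow model, as the abstract argues.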
id cern-1668914
institution European Organization for Nuclear Research (CERN)
language eng
publishDate 2014
record_format invenio
spelling cern-1668914 2019-09-30T06:29:59Z http://cds.cern.ch/record/1668914 eng Vaniachine, A Big Data processing experience in the ATLAS experiment Detectors and Experimental Techniques [abstract text as in the description field above] ATL-SOFT-SLIDE-2014-112 oai:cds.cern.ch:1668914 2014-03-14
spellingShingle Detectors and Experimental Techniques
Vaniachine, A
Big Data processing experience in the ATLAS experiment
title Big Data processing experience in the ATLAS experiment
title_full Big Data processing experience in the ATLAS experiment
title_fullStr Big Data processing experience in the ATLAS experiment
title_full_unstemmed Big Data processing experience in the ATLAS experiment
title_short Big Data processing experience in the ATLAS experiment
title_sort big data processing experience in the atlas experiment
topic Detectors and Experimental Techniques
url http://cds.cern.ch/record/1668914
work_keys_str_mv AT vaniachinea bigdataprocessingexperienceintheatlasexperiment