Big Data processing experience in the ATLAS experiment
Main author | Vaniachine, A
Language | eng
Published | 2014
Subjects | Detectors and Experimental Techniques
Online access | http://cds.cern.ch/record/1668914
_version_ | 1780935520693518336
author | Vaniachine, A |
author_facet | Vaniachine, A |
author_sort | Vaniachine, A |
collection | CERN |
description | To improve the data quality for physics analysis, the ATLAS collaboration completed three major data reprocessing campaigns on the Grid during 2010-2012, with up to 2 PB of data reprocessed every year. The Worldwide LHC Computing Grid provided petabytes of disk storage and tens of thousands of job slots for faster throughput. High throughput is critical for the timely completion of reprocessing campaigns conducted in preparation for major physics conferences. In the 2011 reprocessing campaign the throughput doubled in comparison to 2010. To deliver new physics results for the 2013 Moriond Conference, ATLAS reprocessed twice as much data in November 2012 within the same time period as in the 2011 reprocessing, even though, due to increased LHC pileup, 2012 pp events required twice as much time to reconstruct as 2011 events (a back-of-envelope sketch of this scaling follows the record below). To achieve this throughput, the number of concurrently running jobs exceeded 33k during the ATLAS reprocessing campaign in November 2012. For comparison, the daily average number of running jobs remained below 20k during the “legacy” reprocessing of 2012 pp data conducted by the CMS experiment in January-March 2013. The demands on Grid computing resources are growing, as scheduled LHC upgrades will increase data-taking rates tenfold. Since a tenfold increase in WLCG resources is not an option, a comprehensive model for the composition and execution of the data processing workflow within given CPU and storage constraints is necessary to accommodate the physics needs of the next LHC run. We report on experience gained in ATLAS Big Data processing and on efforts underway to scale up Grid data processing beyond petabytes.
id | cern-1668914 |
institution | European Organization for Nuclear Research (CERN)
language | eng |
publishDate | 2014 |
record_format | invenio |
spelling | cern-1668914 2019-09-30T06:29:59Z http://cds.cern.ch/record/1668914 eng Vaniachine, A Big Data processing experience in the ATLAS experiment Detectors and Experimental Techniques To improve the data quality for physics analysis, the ATLAS collaboration completed three major data reprocessing campaigns on the Grid during 2010-2012, with up to 2 PB of data reprocessed every year. The Worldwide LHC Computing Grid provided petabytes of disk storage and tens of thousands of job slots for faster throughput. High throughput is critical for the timely completion of reprocessing campaigns conducted in preparation for major physics conferences. In the 2011 reprocessing campaign the throughput doubled in comparison to 2010. To deliver new physics results for the 2013 Moriond Conference, ATLAS reprocessed twice as much data in November 2012 within the same time period as in the 2011 reprocessing, even though, due to increased LHC pileup, 2012 pp events required twice as much time to reconstruct as 2011 events. To achieve this throughput, the number of concurrently running jobs exceeded 33k during the ATLAS reprocessing campaign in November 2012. For comparison, the daily average number of running jobs remained below 20k during the “legacy” reprocessing of 2012 pp data conducted by the CMS experiment in January-March 2013. The demands on Grid computing resources are growing, as scheduled LHC upgrades will increase data-taking rates tenfold. Since a tenfold increase in WLCG resources is not an option, a comprehensive model for the composition and execution of the data processing workflow within given CPU and storage constraints is necessary to accommodate the physics needs of the next LHC run. We report on experience gained in ATLAS Big Data processing and on efforts underway to scale up Grid data processing beyond petabytes. ATL-SOFT-SLIDE-2014-112 oai:cds.cern.ch:1668914 2014-03-14
spellingShingle | Detectors and Experimental Techniques Vaniachine, A Big Data processing experience in the ATLAS experiment |
title | Big Data processing experience in the ATLAS experiment |
title_full | Big Data processing experience in the ATLAS experiment |
title_fullStr | Big Data processing experience in the ATLAS experiment |
title_full_unstemmed | Big Data processing experience in the ATLAS experiment |
title_short | Big Data processing experience in the ATLAS experiment |
title_sort | big data processing experience in the atlas experiment |
topic | Detectors and Experimental Techniques |
url | http://cds.cern.ch/record/1668914 |
work_keys_str_mv | AT vaniachinea bigdataprocessingexperienceintheatlasexperiment |
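A minimal back-of-envelope sketch of the 2012-versus-2011 scaling described in the abstract, assuming the number of events to reconstruct scales with the data volume; the symbols R (required processing rate), V (data volume), t (per-event reconstruction time) and T (campaign duration) are illustrative conventions introduced here and do not appear in the original record.

```latex
% Illustrative sketch, not from the original record.
% Assume the number of events N scales with the data volume V, so the total
% reconstruction work is N * t and the sustained rate needed over a campaign
% of duration T is R ~ N * t / T.  Comparing November 2012 with 2011:
\[
\frac{R_{2012}}{R_{2011}}
  = \frac{V_{2012}}{V_{2011}}
    \cdot \frac{t_{2012}}{t_{2011}}
    \cdot \frac{T_{2011}}{T_{2012}}
  \approx 2 \times 2 \times 1
  = 4 .
\]
% Under this assumption, roughly four times the 2011 processing rate had to be
% sustained in the same time window, consistent with the abstract's statement
% that the number of concurrently running jobs was pushed above 33k.
```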