
Evolution of the ATLAS Distributed Computing system during the LHC Long shutdown

The ATLAS Distributed Computing project (ADC) was established in 2007 to develop and operate a framework, following the ATLAS computing model, to enable data storage, processing and bookkeeping on top of the WLCG distributed infrastructure. ADC development has always been driven by operations and th...


Bibliographic Details
Main author: Campana, S
Language: eng
Published: 2013
Subjects: Detectors and Experimental Techniques
Online access: https://dx.doi.org/10.1088/1742-6596/513/3/032016
http://cds.cern.ch/record/1612191
author Campana, S
collection CERN
description The ATLAS Distributed Computing project (ADC) was established in 2007 to develop and operate a framework, following the ATLAS computing model, to enable data storage, processing and bookkeeping on top of the WLCG distributed infrastructure. ADC development has always been driven by operations and this contributed to its success. The system has fulfilled the demanding requirements of ATLAS, daily consolidating worldwide up to 1PB of data and running more than 1.5 million payloads distributed globally, supporting almost one thousand concurrent distributed analysis users. Comprehensive automation and monitoring minimized the operational manpower required. The flexibility of the system to adjust to operational needs has been important to the success of the ATLAS physics program. The LHC shutdown in 2013-2015 affords an opportunity to improve the system in light of operational experience and scale it to cope with the demanding requirements of 2015 and beyond, most notably a much higher trigger rate and event pileup. We will describe the evolution of the ADC software foreseen during this period. This includes consolidating the existing Production and Distributed Analysis framework (PanDA) and ATLAS Grid Information System (AGIS), together with the development and commissioning of next generation systems for distributed data management (DDM/Rucio) and production (PRODSYS2). We will explain how new technologies such as Cloud Computing and NoSQL databases, which ATLAS investigated as R&D projects in past years, will be integrated in production. Finally, we will describe more fundamental developments such as breaking job-to-data locality by exploiting storage federations and caches, and event level (rather than file or dataset level) workload engines.
id cern-1612191
institution European Organization for Nuclear Research
language eng
publishDate 2013
record_format invenio
spelling cern-1612191 2019-09-30T06:29:59Z doi:10.1088/1742-6596/513/3/032016 http://cds.cern.ch/record/1612191 eng Campana, S Detectors and Experimental Techniques ATL-SOFT-PROC-2013-026 oai:cds.cern.ch:1612191 2013-10-19
title Evolution of the ATLAS Distributed Computing system during the LHC Long shutdown
topic Detectors and Experimental Techniques