
Recent Improvements in the ATLAS PanDA Pilot

Bibliographic Details
Main authors: Nilsson, P, Caballero Bejar, J, Contreras, C, Compostella, G, De, K, Potekhin, M, Dos Santos, T, Maeno, T, Wenaus, T
Language: eng
Published: 2012
Subjects:
Online access: https://dx.doi.org/10.1088/1742-6596/396/3/032080
http://cds.cern.ch/record/1450074
Description
Summary: The Production and Distributed Analysis system (PanDA) in the ATLAS experiment uses pilots to execute submitted jobs on the worker nodes. The pilots are designed to deal with different runtime conditions and failure scenarios, and support many storage systems. This talk will give a brief overview of the PanDA pilot system and will present major features and recent improvements including CernVM File System integration, the job retry mechanism, advanced job monitoring including JEM technology, and validation of new pilot code using the HammerCloud stress-testing system. PanDA is used for all ATLAS distributed production and is the primary system for distributed analysis. It is currently used at over 100 sites worldwide. We analyze the performance of the pilot system in processing LHC data on the OSG, EGI and NorduGrid infrastructures used by ATLAS, and describe plans for its further evolution.