Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's

The ATLAS experiment is using large High Performance Computers (HPC's) and fine-grained simulation workflows (Event Service) to produce fully simulated events in an efficient manner. ATLAS has developed a new software component (Harvester) which provides resource provisioning and workload shapi...

Bibliographic Details
Main authors: Benjamin, Douglas, Maeno, Tadashi, Nilsson, Paul, Tsulaia, Vakhtang, Guan, Wen, Oleynik, Danila, Javurkova, Martina, Magini, Nicolo, Childers, John Taylor
Language: eng
Published: 2019
Subjects:
Online access: http://cds.cern.ch/record/2696330
_version_ 1780964173850607616
author Benjamin, Douglas
Maeno, Tadashi
Nilsson, Paul
Tsulaia, Vakhtang
Guan, Wen
Oleynik, Danila
Javurkova, Martina
Magini, Nicolo
Childers, John Taylor
author_facet Benjamin, Douglas
Maeno, Tadashi
Nilsson, Paul
Tsulaia, Vakhtang
Guan, Wen
Oleynik, Danila
Javurkova, Martina
Magini, Nicolo
Childers, John Taylor
author_sort Benjamin, Douglas
collection CERN
description The ATLAS experiment is using large High Performance Computers (HPC's) and fine-grained simulation workflows (Event Service) to produce fully simulated events in an efficient manner. ATLAS has developed a new software component (Harvester) which provides resource provisioning and workload shaping. In order to run effectively on the largest HPC machines, ATLAS developed the Yoda-Droid software to orchestrate the MPI communication between Harvester and the simulation payload running on over 1000 nodes simultaneously. In this way over 130,000 cores can simultaneously produce simulated Monte Carlo events for ATLAS. The PanDA system also had to be changed to produce "jumbo jobs" capable of simulating over 1 million events per submission to the local HPC scheduling systems. This presentation will describe in detail the changes to PanDA to enable jumbo jobs and the Yoda-Droid software. Scaling and efficiency measurements will be presented. Results from deployment, integration and operation of the new software at the Titan, Cori and Theta HPC machines will be shown.
id cern-2696330
institution European Organization for Nuclear Research
language eng
publishDate 2019
record_format invenio
spelling cern-2696330
2019-10-25T19:23:43Z
http://cds.cern.ch/record/2696330
eng
Benjamin, Douglas
Maeno, Tadashi
Nilsson, Paul
Tsulaia, Vakhtang
Guan, Wen
Oleynik, Danila
Javurkova, Martina
Magini, Nicolo
Childers, John Taylor
Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's
Particle Physics - Experiment
The ATLAS experiment is using large High Performance Computers (HPC's) and fine-grained simulation workflows (Event Service) to produce fully simulated events in an efficient manner. ATLAS has developed a new software component (Harvester) which provides resource provisioning and workload shaping. In order to run effectively on the largest HPC machines, ATLAS developed the Yoda-Droid software to orchestrate the MPI communication between Harvester and the simulation payload running on over 1000 nodes simultaneously. In this way over 130,000 cores can simultaneously produce simulated Monte Carlo events for ATLAS. The PanDA system also had to be changed to produce "jumbo jobs" capable of simulating over 1 million events per submission to the local HPC scheduling systems. This presentation will describe in detail the changes to PanDA to enable jumbo jobs and the Yoda-Droid software. Scaling and efficiency measurements will be presented. Results from deployment, integration and operation of the new software at the Titan, Cori and Theta HPC machines will be shown.
ATL-SOFT-SLIDE-2019-807
oai:cds.cern.ch:2696330
2019-10-25
spellingShingle Particle Physics - Experiment
Benjamin, Douglas
Maeno, Tadashi
Nilsson, Paul
Tsulaia, Vakhtang
Guan, Wen
Oleynik, Danila
Javurkova, Martina
Magini, Nicolo
Childers, John Taylor
Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's
title Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's
title_full Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's
title_fullStr Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's
title_full_unstemmed Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's
title_short Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's
title_sort large scale fine grain simulation workflows ("jumbo jobs") on hpc's
topic Particle Physics - Experiment
url http://cds.cern.ch/record/2696330
work_keys_str_mv AT benjamindouglas largescalefinegrainsimulationworkflowsjumbojobsonhpcs
AT maenotadashi largescalefinegrainsimulationworkflowsjumbojobsonhpcs
AT nilssonpaul largescalefinegrainsimulationworkflowsjumbojobsonhpcs
AT tsulaiavakhtang largescalefinegrainsimulationworkflowsjumbojobsonhpcs
AT guanwen largescalefinegrainsimulationworkflowsjumbojobsonhpcs
AT oleynikdanila largescalefinegrainsimulationworkflowsjumbojobsonhpcs
AT javurkovamartina largescalefinegrainsimulationworkflowsjumbojobsonhpcs
AT magininicolo largescalefinegrainsimulationworkflowsjumbojobsonhpcs
AT childersjohntaylor largescalefinegrainsimulationworkflowsjumbojobsonhpcs