Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's
The ATLAS experiment is using large High Performance Computers (HPC's) and fine grained simulation workflows (Event Service) to produce fully simulated events in an efficient manner. ATLAS has developed a new software component (Harvester) which provides resource provisioning and workload shapi...
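Main Authors: | Benjamin, Douglas; Maeno, Tadashi; Nilsson, Paul; Tsulaia, Vakhtang; Guan, Wen; Oleynik, Danila; Javurkova, Martina; Magini, Nicolo; Childers, John Taylor |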
---|---|
Language: | eng |
Published: | 2019 |
Subjects: | Particle Physics - Experiment |
Online Access: | http://cds.cern.ch/record/2696330 |
_version_ | 1780964173850607616 |
---|---|
author | Benjamin, Douglas Maeno, Tadashi Nilsson, Paul Tsulaia, Vakhtang Guan, Wen Oleynik, Danila Javurkova, Martina Magini, Nicolo Childers, John Taylor |
author_facet | Benjamin, Douglas Maeno, Tadashi Nilsson, Paul Tsulaia, Vakhtang Guan, Wen Oleynik, Danila Javurkova, Martina Magini, Nicolo Childers, John Taylor |
author_sort | Benjamin, Douglas |
collection | CERN |
description | The ATLAS experiment is using large High Performance Computers (HPC's) and fine grained simulation workflows (Event Service) to produce fully simulated events in an efficient manner. ATLAS has developed a new software component (Harvester) which provides resource provisioning and workload shaping. In order to run effectively on the largest HPC machines, ATLAS developed the Yoda-Droid software to orchestrate the MPI communication between Harvester and the simulation payload running on over 1000 nodes simultaneously. In this way over 130,000 cores can simultaneously produce simulated Monte Carlo events for ATLAS. The PanDA system also had to be changed to produce "jumbo jobs" capable of simulating over 1 million events per submission to the local HPC scheduling systems. This presentation will describe in detail the changes to PanDA to enable jumbo jobs and the Yoda-Droid software. Scaling and efficiency measurements will be presented. Results from deployment, integration and operation of the new software at the Titan, Cori and Theta HPC machines will be shown. |
id | cern-2696330 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2019 |
record_format | invenio |
spelling | cern-26963302019-10-25T19:23:43Zhttp://cds.cern.ch/record/2696330engBenjamin, DouglasMaeno, TadashiNilsson, PaulTsulaia, VakhtangGuan, WenOleynik, DanilaJavurkova, MartinaMagini, NicoloChilders, John TaylorLarge scale fine grain simulation workflows ("Jumbo Jobs") on HPC'sParticle Physics - ExperimentThe ATLAS experiment is using large High Performance Computers (HPC's) and fine grained simulation workflows (Event Service) to produce fully simulated events in an efficient manner. ATLAS has developed a new software component (Harvester) which provides resource provisioning and workload shaping. In order to run effectively on the largest HPC machines, ATLAS develop Yoda-Droid software to orchestrate the MPI communication between Harvester and the simulation payload running on over 1000 nodes simultaneously. In this way over 130,000 cores can simultaneously produce simulated Monte Carlo events for ATLAS. The PanDA system also had to be changed to produce "jumbo jobs" capable of simulated over 1 Million events per submission to the local HPC scheduling systems. This presentation will describe in detail the changes to PanDA to enable jumbo jobs and the Yoda-Droid software. Scaling and efficiency measurements will be presented. Results from deployment, integration and operation of the new software at the Titan, Cori and Theta HPC machines will be shown.ATL-SOFT-SLIDE-2019-807oai:cds.cern.ch:26963302019-10-25 |
spellingShingle | Particle Physics - Experiment Benjamin, Douglas Maeno, Tadashi Nilsson, Paul Tsulaia, Vakhtang Guan, Wen Oleynik, Danila Javurkova, Martina Magini, Nicolo Childers, John Taylor Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's |
title | Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's |
title_full | Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's |
title_fullStr | Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's |
title_full_unstemmed | Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's |
title_short | Large scale fine grain simulation workflows ("Jumbo Jobs") on HPC's |
title_sort | large scale fine grain simulation workflows ("jumbo jobs") on hpc's |
topic | Particle Physics - Experiment |
url | http://cds.cern.ch/record/2696330 |
work_keys_str_mv | AT benjamindouglas largescalefinegrainsimulationworkflowsjumbojobsonhpcs AT maenotadashi largescalefinegrainsimulationworkflowsjumbojobsonhpcs AT nilssonpaul largescalefinegrainsimulationworkflowsjumbojobsonhpcs AT tsulaiavakhtang largescalefinegrainsimulationworkflowsjumbojobsonhpcs AT guanwen largescalefinegrainsimulationworkflowsjumbojobsonhpcs AT oleynikdanila largescalefinegrainsimulationworkflowsjumbojobsonhpcs AT javurkovamartina largescalefinegrainsimulationworkflowsjumbojobsonhpcs AT magininicolo largescalefinegrainsimulationworkflowsjumbojobsonhpcs AT childersjohntaylor largescalefinegrainsimulationworkflowsjumbojobsonhpcs |