The ATLAS Production System Evolution
The second generation of the ATLAS Production System, called ProdSys2, is a distributed workload manager that runs hundreds of thousands of jobs daily, from dozens of different ATLAS-specific workflows, across more than a hundred heterogeneous sites. It achieves high utilization by combining dynamic j...
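Main authors: | Borodin, Mikhail; Barreiro Megino, Fernando Harald; De, Kaushik; Golubkov, Dmitry; Klimentov, Alexei; Korchuganova, Tatiana; Padolski, Siarhei; Maeno, Tadashi; Nilsson, Paul; Wenaus, Torre |
---|---|
Language: | eng |
Published: | 2017 |
Subjects: | Particle Physics - Experiment |
Online access: | http://cds.cern.ch/record/2285136 |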
_version_ | 1780955864958500864 |
---|---|
author | Borodin, Mikhail; Barreiro Megino, Fernando Harald; De, Kaushik; Golubkov, Dmitry; Klimentov, Alexei; Korchuganova, Tatiana; Padolski, Siarhei; Maeno, Tadashi; Nilsson, Paul; Wenaus, Torre |
author_facet | Borodin, Mikhail; Barreiro Megino, Fernando Harald; De, Kaushik; Golubkov, Dmitry; Klimentov, Alexei; Korchuganova, Tatiana; Padolski, Siarhei; Maeno, Tadashi; Nilsson, Paul; Wenaus, Torre |
author_sort | Borodin, Mikhail |
collection | CERN |
description | The second generation of the ATLAS Production System, called ProdSys2, is a distributed workload manager that runs hundreds of thousands of jobs daily, from dozens of different ATLAS-specific workflows, across more than a hundred heterogeneous sites. It achieves high utilization by combining dynamic job definition, based on criteria such as input and output size, memory requirements and CPU consumption, with manageable scheduling policies, and by supporting different kinds of computational resources, such as GRID, clouds, supercomputers and volunteer computers. The system dynamically assigns a group of jobs (a task) to a group of geographically distributed computing resources. Dynamic assignment and resource utilization are among the major features of the system. The Production System has a sophisticated job fault recovery mechanism, which allows multi-terabyte tasks to run efficiently without human intervention. We have implemented new features that allow automatic task submission and chaining of different types of production. We present recent improvements to the ATLAS Production System and its major components: task definition and the web user interface. We also report on the performance of the system and on how various workflows, such as data (re)processing, Monte Carlo and physics group production, and user analysis, are scheduled and executed within one production system on heterogeneous computing resources. |
id | cern-2285136 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2017 |
record_format | invenio |
spelling | cern-2285136 2019-09-30T06:29:59Z http://cds.cern.ch/record/2285136 eng Borodin, Mikhail; Barreiro Megino, Fernando Harald; De, Kaushik; Golubkov, Dmitry; Klimentov, Alexei; Korchuganova, Tatiana; Padolski, Siarhei; Maeno, Tadashi; Nilsson, Paul; Wenaus, Torre The ATLAS Production System Evolution Particle Physics - Experiment The second generation of the ATLAS Production System called ProdSys2 is a distributed workload manager that runs daily hundreds of thousands of jobs, from dozens of different ATLAS-specific workflows, across more than a hundred heterogeneous sites. It achieves high utilization by combining dynamic job definition based upon many criteria, such as input and output size, memory requirements and CPU consumption, with manageable scheduling policies and by supporting different kinds of computational resources, such as GRID, clouds, supercomputers and volunteer computers. The system dynamically assigns a group of jobs (task) to a group of geographically distributed computing resources. Dynamic assignment and resource utilization is one of the major features of the system. The Production System has a sophisticated job fault recovery mechanism, which efficiently allows running multi-terabyte tasks without human intervention. We have implemented new features which allow automatic task submission and chaining of different types of production. We present recent improvements of the ATLAS Production System and its major components: task definition and web user interface. We also report the performance of the designed system and how various workflows, such as data (re)processing, Monte Carlo and physics group production, and user analysis, are scheduled and executed within one production system on heterogeneous computing resources. ATL-SOFT-SLIDE-2017-799 oai:cds.cern.ch:2285136 2017-09-21 |
spellingShingle | Particle Physics - Experiment; Borodin, Mikhail; Barreiro Megino, Fernando Harald; De, Kaushik; Golubkov, Dmitry; Klimentov, Alexei; Korchuganova, Tatiana; Padolski, Siarhei; Maeno, Tadashi; Nilsson, Paul; Wenaus, Torre; The ATLAS Production System Evolution |
title | The ATLAS Production System Evolution |
title_full | The ATLAS Production System Evolution |
title_fullStr | The ATLAS Production System Evolution |
title_full_unstemmed | The ATLAS Production System Evolution |
title_short | The ATLAS Production System Evolution |
title_sort | atlas production system evolution |
topic | Particle Physics - Experiment |
url | http://cds.cern.ch/record/2285136 |
work_keys_str_mv | AT borodinmikhail theatlasproductionsystemevolution AT barreiromeginofernandoharald theatlasproductionsystemevolution AT dekaushik theatlasproductionsystemevolution AT golubkovdmitry theatlasproductionsystemevolution AT klimentovalexei theatlasproductionsystemevolution AT korchuganovatatiana theatlasproductionsystemevolution AT padolskisiarhei theatlasproductionsystemevolution AT maenotadashi theatlasproductionsystemevolution AT nilssonpaul theatlasproductionsystemevolution AT wenaustorre theatlasproductionsystemevolution AT borodinmikhail atlasproductionsystemevolution AT barreiromeginofernandoharald atlasproductionsystemevolution AT dekaushik atlasproductionsystemevolution AT golubkovdmitry atlasproductionsystemevolution AT klimentovalexei atlasproductionsystemevolution AT korchuganovatatiana atlasproductionsystemevolution AT padolskisiarhei atlasproductionsystemevolution AT maenotadashi atlasproductionsystemevolution AT nilssonpaul atlasproductionsystemevolution AT wenaustorre atlasproductionsystemevolution |