Multilevel Workflow System in the ATLAS Experiment
The ATLAS experiment is scaling up Big Data processing for the next LHC run using a multilevel workflow system comprising many layers. In Big Data processing ATLAS deals with datasets, not individual files. Similarly, a task (comprised of many jobs) has become a unit of the ATLAS workflow in distributed computing.
Main authors: | Borodin, M; Garcia Navarro, J; Golubkov, D; Klimentov, A; Maeno, T; Vaniachine, A |
---|---|
Language: | eng |
Published: | 2014 |
Subjects: | Particle Physics - Experiment |
Online access: | https://dx.doi.org/10.1088/1742-6596/608/1/012015 http://cds.cern.ch/record/1950368 |
_version_ | 1780944209419698176 |
author | Borodin, M; Garcia Navarro, J; Golubkov, D; Klimentov, A; Maeno, T; Vaniachine, A |
author_facet | Borodin, M; Garcia Navarro, J; Golubkov, D; Klimentov, A; Maeno, T; Vaniachine, A |
author_sort | Borodin, M |
collection | CERN |
description | The ATLAS experiment is scaling up Big Data processing for the next LHC run using a multilevel workflow system comprising many layers. In Big Data processing ATLAS deals with datasets, not individual files. Similarly, a task (comprised of many jobs) has become a unit of the ATLAS workflow in distributed computing, with about 0.8M tasks processed per year. In order to manage the diversity of LHC physics (exceeding 35K physics samples per year), the individual data processing tasks are organized into workflows. For example, the Monte Carlo workflow is composed of many steps: generate or configure hard processes, hadronize signal and minimum-bias (pileup) events, simulate energy deposition in the ATLAS detector, digitize electronics response, simulate triggers, reconstruct data, convert the reconstructed data into ROOT ntuples for physics analysis, etc. Outputs are merged and/or filtered as necessary to optimize the chain. The bi-level workflow manager, ProdSys2, generates the actual workflow tasks, and their jobs are executed across more than a hundred distributed computing sites by PanDA, the ATLAS job-level workload management system. On the outer level, the Database Engine for Tasks (DEfT) empowers production managers with templated workflow definitions. On the next level, the Job Execution and Definition Interface (JEDI) is integrated with PanDA to provide dynamic job definition tailored to the sites' capabilities. We report on scaling up the production system to accommodate a growing number of requirements from the main ATLAS areas: Trigger, Physics, and Data Preparation. |
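The bi-level structure described in the abstract — an outer level that expands a templated workflow into a chain of tasks (DEfT), and an inner level that splits each task into jobs sized to a site's capacity (JEDI/PanDA) — can be sketched as follows. This is an illustrative sketch only: all names, steps, and numbers here are hypothetical and do not reflect the actual ProdSys2 API.

```python
# Hypothetical sketch of a bi-level workflow manager:
# outer level expands a workflow template into tasks (DEfT-like),
# inner level splits each task into site-sized jobs (JEDI-like).
from dataclasses import dataclass, field

# A simplified stand-in for a Monte Carlo workflow template.
MC_TEMPLATE = ["generate", "simulate", "digitize", "reconstruct", "ntuple"]

@dataclass
class Task:
    step: str
    n_events: int
    jobs: list = field(default_factory=list)

def expand_workflow(template, n_events):
    """Outer level: turn a templated workflow definition into a task chain."""
    return [Task(step=s, n_events=n_events) for s in template]

def define_jobs(task, site_capacity):
    """Inner level: split a task into jobs no larger than the site's capacity."""
    remaining = task.n_events
    while remaining > 0:
        chunk = min(site_capacity, remaining)
        task.jobs.append({"step": task.step, "events": chunk})
        remaining -= chunk

tasks = expand_workflow(MC_TEMPLATE, n_events=10_000)
for t in tasks:
    define_jobs(t, site_capacity=3_000)

print(len(tasks))          # 5 tasks, one per workflow step
print(len(tasks[0].jobs))  # 4 jobs (3000 + 3000 + 3000 + 1000 events)
```

The key design point the abstract emphasizes is that job definition happens dynamically per site, so the same task chain can run across heterogeneous sites with different capacities.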
id | cern-1950368 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2014 |
record_format | invenio |
spelling | cern-1950368; 2019-09-30T06:29:59Z; doi:10.1088/1742-6596/608/1/012015; http://cds.cern.ch/record/1950368; eng; Borodin, M; Garcia Navarro, J; Golubkov, D; Klimentov, A; Maeno, T; Vaniachine, A; Multilevel Workflow System in the ATLAS Experiment; Particle Physics - Experiment; ATL-SOFT-PROC-2014-005; oai:cds.cern.ch:1950368; 2014-09-25 |
spellingShingle | Particle Physics - Experiment; Borodin, M; Garcia Navarro, J; Golubkov, D; Klimentov, A; Maeno, T; Vaniachine, A; Multilevel Workflow System in the ATLAS Experiment |
title | Multilevel Workflow System in the ATLAS Experiment |
title_full | Multilevel Workflow System in the ATLAS Experiment |
title_fullStr | Multilevel Workflow System in the ATLAS Experiment |
title_full_unstemmed | Multilevel Workflow System in the ATLAS Experiment |
title_short | Multilevel Workflow System in the ATLAS Experiment |
title_sort | multilevel workflow system in the atlas experiment |
topic | Particle Physics - Experiment |
url | https://dx.doi.org/10.1088/1742-6596/608/1/012015 http://cds.cern.ch/record/1950368 |
work_keys_str_mv | AT borodinm multilevelworkflowsystemintheatlasexperiment AT garcianavarroj multilevelworkflowsystemintheatlasexperiment AT golubkovd multilevelworkflowsystemintheatlasexperiment AT klimentova multilevelworkflowsystemintheatlasexperiment AT maenot multilevelworkflowsystemintheatlasexperiment AT vaniachinea multilevelworkflowsystemintheatlasexperiment |