
Multilevel Workflow System in the ATLAS Experiment

The ATLAS experiment is scaling up Big Data processing for the next LHC run using a multilevel workflow system composed of many layers. In Big Data processing, ATLAS deals with datasets, not individual files. Similarly, a task (comprising many jobs) has become the unit of the ATLAS workflow in distributed computing, with about 0.8M tasks processed per year. To manage the diversity of LHC physics (exceeding 35K physics samples per year), the individual data processing tasks are organized into workflows. For example, the Monte Carlo workflow is composed of many steps: generate or configure hard processes, hadronize signal and minimum-bias (pileup) events, simulate energy deposition in the ATLAS detector, digitize the electronics response, simulate triggers, reconstruct data, convert the reconstructed data into ROOT ntuples for physics analysis, etc. Outputs are merged and/or filtered as necessary to optimize the chain. The bi-level workflow manager, ProdSys2, generates the actual workflow tasks, and their jobs are executed across more than a hundred distributed computing sites by PanDA, the ATLAS job-level workload management system. On the outer level, the Database Engine for Tasks (DEfT) empowers production managers with templated workflow definitions. On the next level, the Job Execution and Definition Interface (JEDI) is integrated with PanDA to provide dynamic job definitions tailored to the sites' capabilities. We report on scaling up the production system to accommodate a growing number of requirements from the main ATLAS areas: Trigger, Physics, and Data Preparation.
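
The abstract's bi-level design lends itself to a small illustration: an outer level that expands a templated workflow definition into a chain of tasks (as DEfT does for production managers), and an inner level that splits each task into jobs sized to a site's capabilities (as JEDI does within PanDA). The Python sketch below only illustrates that shape; every name in it (MC_TEMPLATE, Task, define_workflow, split_into_jobs, the dataset names and event counts) is hypothetical and is not the actual ProdSys2, DEfT, or JEDI interface.

from dataclasses import dataclass, field

# Outer level (DEfT-like): a workflow template is an ordered list of
# processing steps; each step consumes the previous step's output dataset.
MC_TEMPLATE = ["evgen", "simul", "digi", "reco", "ntuple"]

@dataclass
class Task:
    step: str
    input_dataset: str
    output_dataset: str
    n_events: int
    jobs: list = field(default_factory=list)

def define_workflow(request, n_events, template=MC_TEMPLATE):
    """Expand a templated workflow definition into a chain of tasks."""
    tasks, upstream = [], request + ".input"
    for step in template:
        output = request + "." + step
        tasks.append(Task(step, upstream, output, n_events))
        upstream = output  # this step's output feeds the next step
    return tasks

def split_into_jobs(task, site_max_events):
    """Inner level (JEDI-like): chunk a task into jobs a site can run."""
    for first in range(0, task.n_events, site_max_events):
        last = min(first + site_max_events, task.n_events)
        task.jobs.append((task.input_dataset, first, last))
    return task.jobs

# Usage: expand a (hypothetical) Monte Carlo request, then size the
# simulation step's jobs for a site that can handle 250 events per job.
tasks = define_workflow("mc14_request", n_events=1000)
jobs = split_into_jobs(tasks[1], site_max_events=250)
print(len(tasks), "tasks;", len(jobs), "jobs for step", tasks[1].step)
# prints: 5 tasks; 4 jobs for step simul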


Bibliographic Details
Main Authors: Borodin, M, Garcia Navarro, J, Golubkov, D, Klimentov, A, Maeno, T, Vaniachine, A
Language: eng
Published: 2014
Subjects: Particle Physics - Experiment
Online Access: https://dx.doi.org/10.1088/1742-6596/608/1/012015
http://cds.cern.ch/record/1950368
author Borodin, M
Garcia Navarro, J
Golubkov, D
Klimentov, A
Maeno, T
Vaniachine, A
author_facet Borodin, M
Garcia Navarro, J
Golubkov, D
Klimentov, A
Maeno, T
Vaniachine, A
author_sort Borodin, M
collection CERN
description The ATLAS experiment is scaling up Big Data processing for the next LHC run using a multilevel workflow system composed of many layers. In Big Data processing, ATLAS deals with datasets, not individual files. Similarly, a task (comprising many jobs) has become the unit of the ATLAS workflow in distributed computing, with about 0.8M tasks processed per year. To manage the diversity of LHC physics (exceeding 35K physics samples per year), the individual data processing tasks are organized into workflows. For example, the Monte Carlo workflow is composed of many steps: generate or configure hard processes, hadronize signal and minimum-bias (pileup) events, simulate energy deposition in the ATLAS detector, digitize the electronics response, simulate triggers, reconstruct data, convert the reconstructed data into ROOT ntuples for physics analysis, etc. Outputs are merged and/or filtered as necessary to optimize the chain. The bi-level workflow manager, ProdSys2, generates the actual workflow tasks, and their jobs are executed across more than a hundred distributed computing sites by PanDA, the ATLAS job-level workload management system. On the outer level, the Database Engine for Tasks (DEfT) empowers production managers with templated workflow definitions. On the next level, the Job Execution and Definition Interface (JEDI) is integrated with PanDA to provide dynamic job definitions tailored to the sites' capabilities. We report on scaling up the production system to accommodate a growing number of requirements from the main ATLAS areas: Trigger, Physics, and Data Preparation.
id cern-1950368
institution European Organization for Nuclear Research
language eng
publishDate 2014
record_format invenio
spelling cern-1950368 2019-09-30T06:29:59Z
doi:10.1088/1742-6596/608/1/012015
http://cds.cern.ch/record/1950368
eng
Borodin, M
Garcia Navarro, J
Golubkov, D
Klimentov, A
Maeno, T
Vaniachine, A
Multilevel Workflow System in the ATLAS Experiment
Particle Physics - Experiment
ATL-SOFT-PROC-2014-005
oai:cds.cern.ch:1950368
2014-09-25
spellingShingle Particle Physics - Experiment
Borodin, M
Garcia Navarro, J
Golubkov, D
Klimentov, A
Maeno, T
Vaniachine, A
Multilevel Workflow System in the ATLAS Experiment
title Multilevel Workflow System in the ATLAS Experiment
title_full Multilevel Workflow System in the ATLAS Experiment
title_fullStr Multilevel Workflow System in the ATLAS Experiment
title_full_unstemmed Multilevel Workflow System in the ATLAS Experiment
title_short Multilevel Workflow System in the ATLAS Experiment
title_sort multilevel workflow system in the atlas experiment
topic Particle Physics - Experiment
url https://dx.doi.org/10.1088/1742-6596/608/1/012015
http://cds.cern.ch/record/1950368
work_keys_str_mv AT borodinm multilevelworkflowsystemintheatlasexperiment
AT garcianavarroj multilevelworkflowsystemintheatlasexperiment
AT golubkovd multilevelworkflowsystemintheatlasexperiment
AT klimentova multilevelworkflowsystemintheatlasexperiment
AT maenot multilevelworkflowsystemintheatlasexperiment
AT vaniachinea multilevelworkflowsystemintheatlasexperiment