Software and experience with managing workflows for the computing operation of the CMS experiment

Bibliographic Details
Main author: Vlimant, Jean-Roch
Language: eng
Published: 2017
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1088/1742-6596/898/5/052025
http://cds.cern.ch/record/2298625
_version_ 1780957026994618368
author Vlimant, Jean-Roch
author_facet Vlimant, Jean-Roch
author_sort Vlimant, Jean-Roch
collection CERN
description We present a system, deployed in the summer of 2015, for the automatic assignment of production and reprocessing workflows for simulation and detector data within the Computing Operation of the CMS experiment at the CERN LHC. Processing requests involves a number of steps in daily operation, including transferring input datasets where relevant and monitoring them, assigning work to the computing resources available on the CMS grid, and delivering the output to the physics groups. Automation becomes critical above a certain number of requests, especially with a view to using computing resources more efficiently and reducing latency. An effort to automate the necessary steps for production and reprocessing recently started, and a new system to handle workflows has been developed. The state-machine system described consists of a set of modules whose key feature is the automatic placement of input datasets, balancing the load across multiple sites. By reducing the operational overhead, these agents enable the utilization of more than double the amount of resources with a robust storage system. Additional functionality was added after months of successful operation to further balance the load on the computing system using remote reads and additional resources. This system contributed to reducing the delivery time of datasets, a crucial aspect for the analysis of CMS data. We report on lessons learned from operation towards increased efficiency in using a largely heterogeneous distributed system of computing, storage and network elements.
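The load-balancing dataset placement that the description outlines can be illustrated with a minimal sketch. Everything below is a hypothetical assumption rather than the CMS production code: the site names, the capacity units, and the greedy largest-first heuristic are chosen only to show how assigning each input dataset to the site with the most free capacity spreads the load across multiple sites.

# Minimal illustrative sketch of load-balanced input-dataset placement.
# Site names, capacities, and the greedy heuristic are assumptions for
# illustration; they do not reproduce the CMS production system.
import heapq

def place_datasets(datasets, site_capacity):
    """Greedily assign each dataset to the site with the most free capacity.

    datasets: dict mapping dataset name -> size (arbitrary units)
    site_capacity: dict mapping site name -> free capacity (same units)
    Returns a dict mapping dataset name -> chosen site.
    """
    # Max-heap on free capacity (negate because heapq is a min-heap).
    heap = [(-free, site) for site, free in site_capacity.items()]
    heapq.heapify(heap)

    placement = {}
    # Place the largest datasets first so they land on the emptiest sites.
    for name, size in sorted(datasets.items(), key=lambda kv: -kv[1]):
        neg_free, site = heapq.heappop(heap)
        placement[name] = site
        heapq.heappush(heap, (neg_free + size, site))  # shrink free space
    return placement

if __name__ == "__main__":
    sites = {"T1_US_FNAL": 500, "T1_DE_KIT": 300, "T2_CH_CERN": 200}
    inputs = {"/RelVal/A": 120, "/RelVal/B": 80, "/Data/C": 250}
    for ds, site in place_datasets(inputs, sites).items():
        print(f"{ds} -> {site}")

A real placement agent would additionally weigh transfer costs, pending queues, and remote-read capability, which the description mentions as later additions to the system.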
id oai-inspirehep.net-1638158
institution European Organization for Nuclear Research
language eng
publishDate 2017
record_format invenio
spelling oai-inspirehep.net-1638158 2021-02-09T10:06:00Z doi:10.1088/1742-6596/898/5/052025 http://cds.cern.ch/record/2298625 eng Vlimant, Jean-Roch Software and experience with managing workflows for the computing operation of the CMS experiment Computing and Computers We present a system, deployed in the summer of 2015, for the automatic assignment of production and reprocessing workflows for simulation and detector data within the Computing Operation of the CMS experiment at the CERN LHC. Processing requests involves a number of steps in daily operation, including transferring input datasets where relevant and monitoring them, assigning work to the computing resources available on the CMS grid, and delivering the output to the physics groups. Automation becomes critical above a certain number of requests, especially with a view to using computing resources more efficiently and reducing latency. An effort to automate the necessary steps for production and reprocessing recently started, and a new system to handle workflows has been developed. The state-machine system described consists of a set of modules whose key feature is the automatic placement of input datasets, balancing the load across multiple sites. By reducing the operational overhead, these agents enable the utilization of more than double the amount of resources with a robust storage system. Additional functionality was added after months of successful operation to further balance the load on the computing system using remote reads and additional resources. This system contributed to reducing the delivery time of datasets, a crucial aspect for the analysis of CMS data. We report on lessons learned from operation towards increased efficiency in using a largely heterogeneous distributed system of computing, storage and network elements. oai:inspirehep.net:1638158 2017
spellingShingle Computing and Computers
Vlimant, Jean-Roch
Software and experience with managing workflows for the computing operation of the CMS experiment
title Software and experience with managing workflows for the computing operation of the CMS experiment
title_full Software and experience with managing workflows for the computing operation of the CMS experiment
title_fullStr Software and experience with managing workflows for the computing operation of the CMS experiment
title_full_unstemmed Software and experience with managing workflows for the computing operation of the CMS experiment
title_short Software and experience with managing workflows for the computing operation of the CMS experiment
title_sort software and experience with managing workflows for the computing operation of the cms experiment
topic Computing and Computers
url https://dx.doi.org/10.1088/1742-6596/898/5/052025
http://cds.cern.ch/record/2298625
work_keys_str_mv AT vlimantjeanroch softwareandexperiencewithmanagingworkflowsforthecomputingoperationofthecmsexperiment