Software and experience with managing workflows for the computing operation of the CMS experiment
We present a system deployed in the summer of 2015 for the automatic assignment of production and reprocessing workflows for simulation and detector data within the Computing Operation of the CMS experiment at the CERN LHC. Processing requests involves a number of steps in the daily operation, including transferring input datasets where relevant and monitoring them, assigning work to computing resources available on the CMS grid, and delivering the output to the Physics groups. Automation becomes critical above a certain number of requests, especially with a view to using computing resources more efficiently and reducing latency. An effort to automate the necessary steps for production and reprocessing recently started, and a new system to handle workflows has been developed. The state-machine system described consists of a set of modules whose key feature is the automatic placement of input datasets, balancing the load across multiple sites. By reducing the operational overhead, these agents enable the utilization of more than double the amount of resources with a robust storage system. Additional functionality was added after months of successful operation to further balance the load on the computing system using remote reads and additional resources. This system contributed to reducing the delivery time of datasets, a crucial aspect of the analysis of CMS data. We report on lessons learned from operation towards increased efficiency in using a largely heterogeneous distributed system of computing, storage and network elements.
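To make the abstract's two central ideas concrete, here is a minimal Python sketch of a per-request state machine combined with input-dataset placement that balances load across sites. This is an illustration under stated assumptions, not the actual CMS production code: the class and function names (`State`, `Site`, `Request`, `place_input`, `advance`), the site names, the slot counts, and the use of free CPU slots as the balancing metric are all hypothetical.

```python
# Hypothetical sketch of the abstract's architecture: a per-request state
# machine, plus input placement on the least-loaded sites. Not CMS code.

from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    NEW = auto()        # request injected for production/reprocessing
    STAGING = auto()    # input dataset being transferred to chosen sites
    ASSIGNED = auto()   # work assigned to grid computing resources
    RUNNING = auto()
    DONE = auto()       # output delivered to the Physics groups


@dataclass
class Site:
    name: str
    free_slots: int     # illustrative load metric: available CPU slots


@dataclass
class Request:
    name: str
    input_dataset: str
    state: State = State.NEW
    sites: list = field(default_factory=list)


def place_input(request: Request, sites: list[Site], copies: int = 2) -> None:
    """Place the input dataset on the least-loaded sites (toy balancing)."""
    ranked = sorted(sites, key=lambda s: s.free_slots, reverse=True)
    request.sites = [s.name for s in ranked[:copies]]
    request.state = State.STAGING


def advance(request: Request) -> None:
    """One pass of the agent: move the request one step along its states."""
    transitions = {
        State.STAGING: State.ASSIGNED,   # transfers complete -> assign work
        State.ASSIGNED: State.RUNNING,
        State.RUNNING: State.DONE,       # output delivered
    }
    request.state = transitions.get(request.state, request.state)


if __name__ == "__main__":
    grid = [Site("T1_US_FNAL", 5000), Site("T2_CH_CERN", 8000),
            Site("T2_DE_DESY", 3000)]
    req = Request("prod-2015-001", "/GenSim/Sample/RAW")
    place_input(req, grid)
    print(req.state.name, req.sites)   # STAGING ['T2_CH_CERN', 'T1_US_FNAL']
    while req.state is not State.DONE:
        advance(req)
    print(req.state.name)              # DONE
```

In the real system each transition would be driven by monitoring (transfer completion, job status) rather than a simple loop, but the shape is the same: agents repeatedly sweep over requests and advance each one when its preconditions are met.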
Main author: | Vlimant, Jean-Roch |
---|---|
Language: | eng |
Published: | 2017 |
Subjects: | Computing and Computers |
Online access: | https://dx.doi.org/10.1088/1742-6596/898/5/052025 http://cds.cern.ch/record/2298625 |
_version_ | 1780957026994618368 |
---|---|
author | Vlimant, Jean-Roch |
author_facet | Vlimant, Jean-Roch |
author_sort | Vlimant, Jean-Roch |
collection | CERN |
description | We present a system deployed in the summer of 2015 for the automatic assignment of production and reprocessing workflows for simulation and detector data within the Computing Operation of the CMS experiment at the CERN LHC. Processing requests involves a number of steps in the daily operation, including transferring input datasets where relevant and monitoring them, assigning work to computing resources available on the CMS grid, and delivering the output to the Physics groups. Automation becomes critical above a certain number of requests, especially with a view to using computing resources more efficiently and reducing latency. An effort to automate the necessary steps for production and reprocessing recently started, and a new system to handle workflows has been developed. The state-machine system described consists of a set of modules whose key feature is the automatic placement of input datasets, balancing the load across multiple sites. By reducing the operational overhead, these agents enable the utilization of more than double the amount of resources with a robust storage system. Additional functionality was added after months of successful operation to further balance the load on the computing system using remote reads and additional resources. This system contributed to reducing the delivery time of datasets, a crucial aspect of the analysis of CMS data. We report on lessons learned from operation towards increased efficiency in using a largely heterogeneous distributed system of computing, storage and network elements. |
id | oai-inspirehep.net-1638158 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2017 |
record_format | invenio |
spelling | oai-inspirehep.net-1638158 2021-02-09T10:06:00Z doi:10.1088/1742-6596/898/5/052025 http://cds.cern.ch/record/2298625 eng Vlimant, Jean-Roch Software and experience with managing workflows for the computing operation of the CMS experiment Computing and Computers We present a system deployed in the summer of 2015 for the automatic assignment of production and reprocessing workflows for simulation and detector data within the Computing Operation of the CMS experiment at the CERN LHC. Processing requests involves a number of steps in the daily operation, including transferring input datasets where relevant and monitoring them, assigning work to computing resources available on the CMS grid, and delivering the output to the Physics groups. Automation becomes critical above a certain number of requests, especially with a view to using computing resources more efficiently and reducing latency. An effort to automate the necessary steps for production and reprocessing recently started, and a new system to handle workflows has been developed. The state-machine system described consists of a set of modules whose key feature is the automatic placement of input datasets, balancing the load across multiple sites. By reducing the operational overhead, these agents enable the utilization of more than double the amount of resources with a robust storage system. Additional functionality was added after months of successful operation to further balance the load on the computing system using remote reads and additional resources. This system contributed to reducing the delivery time of datasets, a crucial aspect of the analysis of CMS data. We report on lessons learned from operation towards increased efficiency in using a largely heterogeneous distributed system of computing, storage and network elements. oai:inspirehep.net:1638158 2017 |
spellingShingle | Computing and Computers Vlimant, Jean-Roch Software and experience with managing workflows for the computing operation of the CMS experiment |
title | Software and experience with managing workflows for the computing operation of the CMS experiment |
title_full | Software and experience with managing workflows for the computing operation of the CMS experiment |
title_fullStr | Software and experience with managing workflows for the computing operation of the CMS experiment |
title_full_unstemmed | Software and experience with managing workflows for the computing operation of the CMS experiment |
title_short | Software and experience with managing workflows for the computing operation of the CMS experiment |
title_sort | software and experience with managing workflows for the computing operation of the cms experiment |
topic | Computing and Computers |
url | https://dx.doi.org/10.1088/1742-6596/898/5/052025 http://cds.cern.ch/record/2298625 |
work_keys_str_mv | AT vlimantjeanroch softwareandexperiencewithmanagingworkflowsforthecomputingoperationofthecmsexperiment |