Bringing the CMS distributed computing system into scalable operations

Establishing efficient and scalable operations of the CMS distributed computing system critically relies on the proper integration, commissioning and scale testing of the data and workload management tools, the various computing workflows and the underlying computing infrastructure, located at more...

Bibliographic Details
Main authors: Belforte, S, Fanfani, A, Fisk, I, Flix, J, Hernández, J M, Kress, T, Letts, J, Magini, N, Miccio, V, Sciabà, A
Language: eng
Published: 2009
Subjects: Detectors and Experimental Techniques
Online access: https://dx.doi.org/10.1088/1742-6596/219/6/062015
http://cds.cern.ch/record/1196148
_version_ 1780917065913204736
author Belforte, S
Fanfani, A
Fisk, I
Flix, J
Hernández, J M
Kress, T
Letts, J
Magini, N
Miccio, V
Sciabà, A
author_facet Belforte, S
Fanfani, A
Fisk, I
Flix, J
Hernández, J M
Kress, T
Letts, J
Magini, N
Miccio, V
Sciabà, A
author_sort Belforte, S
collection CERN
description Establishing efficient and scalable operations of the CMS distributed computing system critically relies on the proper integration, commissioning and scale testing of the data and workload management tools, the various computing workflows and the underlying computing infrastructure, located at more than 50 computing centres worldwide and interconnected by the Worldwide LHC Computing Grid. Computing challenges periodically undertaken by CMS in the past years with increasing scale and complexity have revealed the need for a sustained effort on computing integration and commissioning activities. The Processing and Data Access (PADA) Task Force was established at the beginning of 2008 within the CMS Computing Program with the mandate of validating the infrastructure for organized processing and user analysis including the sites and the workload and data management tools, validating the distributed production system by performing functionality, reliability and scale tests, helping sites to commission, configure and optimize the networking and storage through scale testing data transfers and data processing, and improving the efficiency of accessing data across the CMS computing system from global transfers to local access. This contribution reports on the tools and procedures developed by CMS for computing commissioning and scale testing as well as the improvements accomplished towards efficient, reliable and scalable computing operations. The activities include the development and operation of load generators for job submission and data transfers with the aim of stressing the experiment and Grid data management and workload management systems, site commissioning procedures and tools to monitor and improve site availability and reliability, as well as activities targeted to the commissioning of the distributed production, user analysis and monitoring systems.
id cern-1196148
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2009
record_format invenio
spelling cern-1196148
2019-09-30T06:29:59Z
doi:10.1088/1742-6596/219/6/062015
http://cds.cern.ch/record/1196148
eng
Belforte, S
Fanfani, A
Fisk, I
Flix, J
Hernández, J M
Kress, T
Letts, J
Magini, N
Miccio, V
Sciabà, A
Bringing the CMS distributed computing system into scalable operations
Detectors and Experimental Techniques
Establishing efficient and scalable operations of the CMS distributed computing system critically relies on the proper integration, commissioning and scale testing of the data and workload management tools, the various computing workflows and the underlying computing infrastructure, located at more than 50 computing centres worldwide and interconnected by the Worldwide LHC Computing Grid. Computing challenges periodically undertaken by CMS in the past years with increasing scale and complexity have revealed the need for a sustained effort on computing integration and commissioning activities. The Processing and Data Access (PADA) Task Force was established at the beginning of 2008 within the CMS Computing Program with the mandate of validating the infrastructure for organized processing and user analysis including the sites and the workload and data management tools, validating the distributed production system by performing functionality, reliability and scale tests, helping sites to commission, configure and optimize the networking and storage through scale testing data transfers and data processing, and improving the efficiency of accessing data across the CMS computing system from global transfers to local access. This contribution reports on the tools and procedures developed by CMS for computing commissioning and scale testing as well as the improvements accomplished towards efficient, reliable and scalable computing operations. The activities include the development and operation of load generators for job submission and data transfers with the aim of stressing the experiment and Grid data management and workload management systems, site commissioning procedures and tools to monitor and improve site availability and reliability, as well as activities targeted to the commissioning of the distributed production, user analysis and monitoring systems.
CMS-CR-2009-087
oai:cds.cern.ch:1196148
2009-05-13
spellingShingle Detectors and Experimental Techniques
Belforte, S
Fanfani, A
Fisk, I
Flix, J
Hernández, J M
Kress, T
Letts, J
Magini, N
Miccio, V
Sciabà, A
Bringing the CMS distributed computing system into scalable operations
title Bringing the CMS distributed computing system into scalable operations
title_full Bringing the CMS distributed computing system into scalable operations
title_fullStr Bringing the CMS distributed computing system into scalable operations
title_full_unstemmed Bringing the CMS distributed computing system into scalable operations
title_short Bringing the CMS distributed computing system into scalable operations
title_sort bringing the cms distributed computing system into scalable operations
topic Detectors and Experimental Techniques
url https://dx.doi.org/10.1088/1742-6596/219/6/062015
http://cds.cern.ch/record/1196148