
Evolution of CMS Workload Management Towards Multicore Job Support

The successful exploitation of multicore processor architectures is a key element of the LHC distributed computing system in the coming era of the LHC Run 2. High-pileup complex-collision events represent a challenge for the traditional sequential programming in terms of memory and processing time b...


Bibliographic Details
Main authors: Perez-Calero Yzquierdo, A, Hernández, J M, Khan, F A, Letts, J, Majewski, K, Rodrigues, A M, McCrea, A, Vaandering, E
Language: eng
Published: 2015
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1088/1742-6596/664/6/062046
http://cds.cern.ch/record/2134609
_version_ 1780949915290042368
author Perez-Calero Yzquierdo, A
Hernández, J M
Khan, F A
Letts, J
Majewski, K
Rodrigues, A M
McCrea, A
Vaandering, E
author_facet Perez-Calero Yzquierdo, A
Hernández, J M
Khan, F A
Letts, J
Majewski, K
Rodrigues, A M
McCrea, A
Vaandering, E
author_sort Perez-Calero Yzquierdo, A
collection CERN
description The successful exploitation of multicore processor architectures is a key element of the LHC distributed computing system in the coming era of the LHC Run 2. High-pileup complex-collision events represent a challenge for traditional sequential programming in terms of memory and processing time budget. The CMS data production and processing framework is introducing the parallel execution of the reconstruction and simulation algorithms to overcome these limitations. CMS plans to execute multicore jobs while still supporting single-core processing for other tasks that are difficult to parallelize, such as user analysis. The CMS strategy for job management thus aims at integrating single-core and multicore job scheduling across the Grid. This is accomplished by employing multicore pilots with internal dynamic partitioning of the allocated resources, capable of running payloads of various core counts simultaneously. An extensive test programme has been conducted to enable multicore scheduling with the various local batch systems available at CMS sites, with the focus on the Tier-0 and Tier-1s, which are responsible for prompt data reconstruction during 2015. Scale tests have been run to analyse the performance of this scheduling strategy and ensure an efficient use of the distributed resources. This paper presents the evolution of the CMS job management and resource provisioning systems in order to support this hybrid scheduling model, as well as its deployment and performance tests, which will enable CMS to transition to a multicore production model for the second LHC run.
id oai-inspirehep.net-1413965
institution European Organization for Nuclear Research
language eng
publishDate 2015
record_format invenio
spelling oai-inspirehep.net-1413965; 2022-08-10T13:01:00Z; doi:10.1088/1742-6596/664/6/062046; http://cds.cern.ch/record/2134609; eng; Perez-Calero Yzquierdo, A; Hernández, J M; Khan, F A; Letts, J; Majewski, K; Rodrigues, A M; McCrea, A; Vaandering, E; Evolution of CMS Workload Management Towards Multicore Job Support; Computing and Computers; The successful exploitation of multicore processor architectures is a key element of the LHC distributed computing system in the coming era of the LHC Run 2. High-pileup complex-collision events represent a challenge for traditional sequential programming in terms of memory and processing time budget. The CMS data production and processing framework is introducing the parallel execution of the reconstruction and simulation algorithms to overcome these limitations. CMS plans to execute multicore jobs while still supporting single-core processing for other tasks that are difficult to parallelize, such as user analysis. The CMS strategy for job management thus aims at integrating single-core and multicore job scheduling across the Grid. This is accomplished by employing multicore pilots with internal dynamic partitioning of the allocated resources, capable of running payloads of various core counts simultaneously. An extensive test programme has been conducted to enable multicore scheduling with the various local batch systems available at CMS sites, with the focus on the Tier-0 and Tier-1s, which are responsible for prompt data reconstruction during 2015. Scale tests have been run to analyse the performance of this scheduling strategy and ensure an efficient use of the distributed resources. This paper presents the evolution of the CMS job management and resource provisioning systems in order to support this hybrid scheduling model, as well as its deployment and performance tests, which will enable CMS to transition to a multicore production model for the second LHC run.; FERMILAB-CONF-15-607-CD; oai:inspirehep.net:1413965; 2015
spellingShingle Computing and Computers
Perez-Calero Yzquierdo, A
Hernández, J M
Khan, F A
Letts, J
Majewski, K
Rodrigues, A M
McCrea, A
Vaandering, E
Evolution of CMS Workload Management Towards Multicore Job Support
title Evolution of CMS Workload Management Towards Multicore Job Support
title_full Evolution of CMS Workload Management Towards Multicore Job Support
title_fullStr Evolution of CMS Workload Management Towards Multicore Job Support
title_full_unstemmed Evolution of CMS Workload Management Towards Multicore Job Support
title_short Evolution of CMS Workload Management Towards Multicore Job Support
title_sort evolution of cms workload management towards multicore job support
topic Computing and Computers
url https://dx.doi.org/10.1088/1742-6596/664/6/062046
http://cds.cern.ch/record/2134609
work_keys_str_mv AT perezcaleroyzquierdoa evolutionofcmsworkloadmanagementtowardsmulticorejobsupport
AT hernandezjm evolutionofcmsworkloadmanagementtowardsmulticorejobsupport
AT khanfa evolutionofcmsworkloadmanagementtowardsmulticorejobsupport
AT lettsj evolutionofcmsworkloadmanagementtowardsmulticorejobsupport
AT majewskik evolutionofcmsworkloadmanagementtowardsmulticorejobsupport
AT rodriguesam evolutionofcmsworkloadmanagementtowardsmulticorejobsupport
AT mccreaa evolutionofcmsworkloadmanagementtowardsmulticorejobsupport
AT vaanderinge evolutionofcmsworkloadmanagementtowardsmulticorejobsupport