
Stability and scalability of the CMS Global Pool: Pushing HTCondor and glideinWMS to new limits

The CMS Global Pool, based on HTCondor and glideinWMS, is the main computing resource provisioning system for all CMS workflows, including analysis, Monte Carlo production, and detector data reprocessing activities. The total resources at Tier-1 and Tier-2 grid sites pledged to CMS exceed 100,000 CP...


Bibliographic Details
Main authors: Balcas, J, Bockelman, B, Hufnagel, D, Hurtado Anampa, K, Aftab Khan, F, Larson, K, Letts, J, Marra da Silva, J, Mascheroni, M, Mason, D, Perez-Calero Yzquierdo, A, Tiradani, A
Language: eng
Published: 2017
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1088/1742-6596/898/5/052031
http://cds.cern.ch/record/2297171
_version_ 1780956901495799808
author Balcas, J
Bockelman, B
Hufnagel, D
Hurtado Anampa, K
Aftab Khan, F
Larson, K
Letts, J
Marra da Silva, J
Mascheroni, M
Mason, D
Perez-Calero Yzquierdo, A
Tiradani, A
collection CERN
description The CMS Global Pool, based on HTCondor and glideinWMS, is the main computing resource provisioning system for all CMS workflows, including analysis, Monte Carlo production, and detector data reprocessing activities. The total resources at Tier-1 and Tier-2 grid sites pledged to CMS exceed 100,000 CPU cores, while another 50,000 to 100,000 CPU cores are available opportunistically, pushing the needs of the Global Pool to higher scales each year. These resources are becoming more diverse in their accessibility and configuration over time. Furthermore, the challenge of stably running at higher and higher scales while introducing new modes of operation such as multi-core pilots, as well as the chaotic nature of physics analysis workflows, places huge strains on the submission infrastructure. This paper details some of the most important challenges to scalability and stability that the CMS Global Pool has faced since the beginning of the LHC Run II and how they were overcome.
id oai-inspirehep.net-1638488
institution European Organization for Nuclear Research
language eng
publishDate 2017
record_format invenio
spelling oai-inspirehep.net-1638488; 2021-02-09T10:07:49Z; doi:10.1088/1742-6596/898/5/052031; http://cds.cern.ch/record/2297171; eng; oai:inspirehep.net:1638488; 2017
title Stability and scalability of the CMS Global Pool: Pushing HTCondor and glideinWMS to new limits
topic Computing and Computers
url https://dx.doi.org/10.1088/1742-6596/898/5/052031
http://cds.cern.ch/record/2297171