Stability and scalability of the CMS Global Pool: Pushing HTCondor and glideinWMS to new limits
The CMS Global Pool, based on HTCondor and glideinWMS, is the main computing resource provisioning system for all CMS workflows, including analysis, Monte Carlo production, and detector data reprocessing activities. The total resources at Tier-1 and Tier-2 grid sites pledged to CMS exceed 100,000 CP...
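Main authors: | Balcas, J; Bockelman, B; Hufnagel, D; Hurtado Anampa, K; Aftab Khan, F; Larson, K; Letts, J; Marra da Silva, J; Mascheroni, M; Mason, D; Perez-Calero Yzquierdo, A; Tiradani, A |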
---|---|
Language: | eng |
Published: | 2017 |
Subjects: | Computing and Computers |
Online access: | https://dx.doi.org/10.1088/1742-6596/898/5/052031 http://cds.cern.ch/record/2297171 |
_version_ | 1780956901495799808 |
---|---|
author | Balcas, J; Bockelman, B; Hufnagel, D; Hurtado Anampa, K; Aftab Khan, F; Larson, K; Letts, J; Marra da Silva, J; Mascheroni, M; Mason, D; Perez-Calero Yzquierdo, A; Tiradani, A |
author_sort | Balcas, J |
collection | CERN |
description | The CMS Global Pool, based on HTCondor and glideinWMS, is the main computing resource provisioning system for all CMS workflows, including analysis, Monte Carlo production, and detector data reprocessing activities. The total resources at Tier-1 and Tier-2 grid sites pledged to CMS exceed 100,000 CPU cores, while another 50,000 to 100,000 CPU cores are available opportunistically, pushing the needs of the Global Pool to higher scales each year. These resources are becoming more diverse in their accessibility and configuration over time. Furthermore, the challenge of stably running at higher and higher scales while introducing new modes of operation such as multi-core pilots, as well as the chaotic nature of physics analysis workflows, places huge strains on the submission infrastructure. This paper details some of the most important challenges to scalability and stability that the CMS Global Pool has faced since the beginning of the LHC Run II and how they were overcome. |
id | oai-inspirehep.net-1638488 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2017 |
record_format | invenio |
title | Stability and scalability of the CMS Global Pool: Pushing HTCondor and glideinWMS to new limits |
title_sort | stability and scalability of the cms global pool: pushing htcondor and glideinwms to new limits |
topic | Computing and Computers |
url | https://dx.doi.org/10.1088/1742-6596/898/5/052031 http://cds.cern.ch/record/2297171 |