Effective HTCondor-based monitoring system for CMS
The CMS experiment at the LHC relies on HTCondor and glideinWMS as its primary batch and pilot-based Grid provisioning systems, respectively. Given the scale of the global queue in CMS, the operators found it increasingly difficult to monitor the pool to find problems and fix them. The operators had...
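Main authors: | Balcas, J; Bockelman, B P; Da Silva, J M; Hernandez, J; Khan, F A; Letts, J; Mascheroni, M; Mason, D A; Perez-Calero Yzquierdo, A; Vlimant, J R |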
---|---|
Language: | eng |
Published: | 2017 |
Subjects: | Computing and Computers |
Online access: | https://dx.doi.org/10.1088/1742-6596/898/9/092039 http://cds.cern.ch/record/2298469 |
_version_ | 1780956988971155456 |
---|---|
author | Balcas, J; Bockelman, B P; Da Silva, J M; Hernandez, J; Khan, F A; Letts, J; Mascheroni, M; Mason, D A; Perez-Calero Yzquierdo, A; Vlimant, J R |
author_sort | Balcas, J |
collection | CERN |
description | The CMS experiment at the LHC relies on HTCondor and glideinWMS as its primary batch and pilot-based Grid provisioning systems, respectively. Given the scale of the global queue in CMS, the operators found it increasingly difficult to monitor the pool to find problems and fix them. The operators had to rely on several different web pages, with several different levels of information, and sift tirelessly through log files in order to monitor the pool completely. Therefore, coming up with a suitable monitoring system was one of the crucial items before the beginning of the LHC Run 2 in order to ensure early detection of issues and to give a good overview of the whole pool. Our new monitoring page utilizes the HTCondor ClassAd information to provide a complete picture of the whole submission infrastructure in CMS. The monitoring page includes useful information from HTCondor schedulers, central managers, the glideinWMS frontend, and factories. It also incorporates information about users and tasks, making it easy for operators to provide support and debug issues. |
id | oai-inspirehep.net-1638217 |
institution | European Organization for Nuclear Research (CERN) |
language | eng |
publishDate | 2017 |
record_format | invenio |
title | Effective HTCondor-based monitoring system for CMS |
topic | Computing and Computers |
url | https://dx.doi.org/10.1088/1742-6596/898/9/092039 http://cds.cern.ch/record/2298469 |