
Effective HTCondor-based monitoring system for CMS

The CMS experiment at the LHC relies on HTCondor and glideinWMS as its primary batch and pilot-based Grid provisioning systems, respectively. Given the scale of the global queue in CMS, the operators found it increasingly difficult to monitor the pool to find problems and fix them. The operators had...


Bibliographic Details
Main authors: Balcas, J, Bockelman, B P, Da Silva, J M, Hernandez, J, Khan, F A, Letts, J, Mascheroni, M, Mason, D A, Perez-Calero Yzquierdo, A, Vlimant, J R
Language: eng
Published: 2017
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1088/1742-6596/898/9/092039
http://cds.cern.ch/record/2298469
_version_ 1780956988971155456
author Balcas, J
Bockelman, B P
Da Silva, J M
Hernandez, J
Khan, F A
Letts, J
Mascheroni, M
Mason, D A
Perez-Calero Yzquierdo, A
Vlimant, J R
author_facet Balcas, J
Bockelman, B P
Da Silva, J M
Hernandez, J
Khan, F A
Letts, J
Mascheroni, M
Mason, D A
Perez-Calero Yzquierdo, A
Vlimant, J R
author_sort Balcas, J
collection CERN
description The CMS experiment at the LHC relies on HTCondor and glideinWMS as its primary batch and pilot-based Grid provisioning systems, respectively. Given the scale of the global queue in CMS, the operators found it increasingly difficult to monitor the pool to find problems and fix them. The operators had to rely on several different web pages, with several different levels of information, and sift tirelessly through log files in order to monitor the pool completely. Therefore, coming up with a suitable monitoring system was one of the crucial items before the beginning of the LHC Run 2 in order to ensure early detection of issues and to give a good overview of the whole pool. Our new monitoring page utilizes the HTCondor ClassAd information to provide a complete picture of the whole submission infrastructure in CMS. The monitoring page includes useful information from HTCondor schedulers, central managers, the glideinWMS frontend, and factories. It also incorporates information about users and tasks making it easy for operators to provide support and debug issues.
id oai-inspirehep.net-1638217
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2017
record_format invenio
spelling oai-inspirehep.net-1638217 | 2021-02-09T10:05:43Z | doi:10.1088/1742-6596/898/9/092039 | http://cds.cern.ch/record/2298469 | eng | Balcas, J | Bockelman, B P | Da Silva, J M | Hernandez, J | Khan, F A | Letts, J | Mascheroni, M | Mason, D A | Perez-Calero Yzquierdo, A | Vlimant, J R | Effective HTCondor-based monitoring system for CMS | Computing and Computers | The CMS experiment at the LHC relies on HTCondor and glideinWMS as its primary batch and pilot-based Grid provisioning systems, respectively. Given the scale of the global queue in CMS, the operators found it increasingly difficult to monitor the pool to find problems and fix them. The operators had to rely on several different web pages, with several different levels of information, and sift tirelessly through log files in order to monitor the pool completely. Therefore, coming up with a suitable monitoring system was one of the crucial items before the beginning of the LHC Run 2 in order to ensure early detection of issues and to give a good overview of the whole pool. Our new monitoring page utilizes the HTCondor ClassAd information to provide a complete picture of the whole submission infrastructure in CMS. The monitoring page includes useful information from HTCondor schedulers, central managers, the glideinWMS frontend, and factories. It also incorporates information about users and tasks making it easy for operators to provide support and debug issues. | oai:inspirehep.net:1638217 | 2017
spellingShingle Computing and Computers
Balcas, J
Bockelman, B P
Da Silva, J M
Hernandez, J
Khan, F A
Letts, J
Mascheroni, M
Mason, D A
Perez-Calero Yzquierdo, A
Vlimant, J R
Effective HTCondor-based monitoring system for CMS
title Effective HTCondor-based monitoring system for CMS
title_full Effective HTCondor-based monitoring system for CMS
title_fullStr Effective HTCondor-based monitoring system for CMS
title_full_unstemmed Effective HTCondor-based monitoring system for CMS
title_short Effective HTCondor-based monitoring system for CMS
title_sort effective htcondor-based monitoring system for cms
topic Computing and Computers
url https://dx.doi.org/10.1088/1742-6596/898/9/092039
http://cds.cern.ch/record/2298469
work_keys_str_mv AT balcasj effectivehtcondorbasedmonitoringsystemforcms
AT bockelmanbp effectivehtcondorbasedmonitoringsystemforcms
AT dasilvajm effectivehtcondorbasedmonitoringsystemforcms
AT hernandezj effectivehtcondorbasedmonitoringsystemforcms
AT khanfa effectivehtcondorbasedmonitoringsystemforcms
AT lettsj effectivehtcondorbasedmonitoringsystemforcms
AT mascheronim effectivehtcondorbasedmonitoringsystemforcms
AT masonda effectivehtcondorbasedmonitoringsystemforcms
AT perezcaleroyzquierdoa effectivehtcondorbasedmonitoringsystemforcms
AT vlimantjr effectivehtcondorbasedmonitoringsystemforcms