Cargando…
Improved ATLAS HammerCloud Monitoring for local Site Administration
Every day hundreds of tests are run on the Worldwide LHC Computing Grid for the ATLAS, CMS, and LHCb experiments in order to evaluate the performance and reliability of the different computing sites. All this activity is steered, controlled, and monitored by the HammerCloud testing infrastructure. S...
Autores principales: | , , , |
---|---|
Lenguaje: | eng |
Publicado: |
2015
|
Materias: | |
Acceso en línea: | http://cds.cern.ch/record/2001844 |
Sumario: | Every day hundreds of tests are run on the Worldwide LHC Computing Grid for the ATLAS, CMS, and LHCb experiments in order to evaluate the performance and reliability of the different computing sites. All this activity is steered, controlled, and monitored by the HammerCloud testing infrastructure. Sites with failing functionality tests are auto-excluded from the ATLAS computing grid, therefore it is essential to provide a detailed and well organized web interface for the local site administrators such that they can easily spot and promptly solve site issues. Additional functionalities have been developed to extract and visualize the most relevant information. The site administrators can now be pointed easily to major site issues which lead to site blacklisting as well as possible minor issues that are usually not conspicuous enough to warrant the blacklisting of a specific site, but can still cause undesired effects such as a non-negligible job failure rate. This contribution summarizes the different developments and optimizations of the HammerCloud web interface and gives an overview of typical use cases. |
---|