Cargando…

ATLAS Distributed Computing Monitoring tools during the LHC Run I

This contribution summarizes evolution of the ATLAS Distributed Computing (ADC) Monitoring project during the LHC Run I. The ADC Monitoring targets at the three groups of customers: ADC Operations team to early identify malfunctions and escalate issues to an activity or a service expert, ATLAS natio...

Descripción completa

Detalles Bibliográficos
Autores principales: Schovancova, J, Campana, S, Di Girolamo, A, Jezequel, S, Ueda, I, Wenaus, T
Lenguaje:eng
Publicado: 2013
Materias:
Acceso en línea:https://dx.doi.org/10.1088/1742-6596/513/3/032084
http://cds.cern.ch/record/1605894
Descripción
Sumario:This contribution summarizes evolution of the ATLAS Distributed Computing (ADC) Monitoring project during the LHC Run I. The ADC Monitoring targets at the three groups of customers: ADC Operations team to early identify malfunctions and escalate issues to an activity or a service expert, ATLAS national contacts and sites for the real-time monitoring and long-term measurement of the performance of the provided computing resources, and the ATLAS Management for long-term trends and accounting information about the ATLAS Distributed Computing resources.\\ During the LHC Run I a significant development effort has been invested in standardization of the monitoring and accounting applications in order to provide extensive monitoring and accounting suite. ADC Monitoring applications separate the data layer and the visualization layer. The data layer exposes data in a predefined format. The visualization layer is designed bearing in mind visual identity of the provided graphical elements, and re-usability of the visualization bits across the different tools. A rich family of various filtering and searching options enhancing available user interfaces comes naturally with the data and visualization layer separation.\\ With a variety of reliable monitoring dataccessible through standardized interfaces, the possibility of automating actions under well defined conditions correlating multiple data sources has become feasible. In this contribution we discuss also about the automated exclusion of degraded resources and their automated recovery in various activities.\\