Evolution of SAM in an Enhanced Model for Monitoring WLCG Services

Bibliographic Details
Main authors: Collados, D; Shade, J; Traylen, S; Imamagic, E
Language: eng
Published: 2009
Subjects:
Online access: https://dx.doi.org/10.1088/1742-6596/219/6/062008
http://cds.cern.ch/record/1177866
Description
Summary: It is four years now since the first prototypes of tools and tests started to monitor the Worldwide LHC Computing Grid (WLCG) services. One of these tools is the Service Availability Monitoring (SAM) framework, which superseded the SFT tool and has become a keystone for the monthly WLCG availability and reliability computations. During this time, the grid has evolved into a robust, production-level infrastructure, in no small part thanks to the extensive monitoring infrastructure, which includes testing, visualization and reporting. Experience gained with monitoring has led to emerging grid monitoring standards and provided valuable input for the Operations Automation Strategy aimed at the regionalization of monitoring services. This change in scope, together with an ever-increasing number of services and infrastructures, makes enhancements in the architecture of existing monitoring tools a necessity. This paper describes the present architecture of SAM, an enhanced and distributed model for monitoring WLCG services, and the required changes in SAM to adopt this new model inside the EGEE-III project.