
Evaluation and implementation of CEP mechanisms to act upon infrastructure metrics monitored by Ganglia


Bibliographic Details
Main author: Adam, Martin
Language: eng
Published: 2015
Subjects:
Online access: http://cds.cern.ch/record/2049241
Description
Summary: The LHC experiments are progressively moving towards computing resources that are provided dynamically by cloud services. It is important to monitor the health and performance of the virtual machines in these dynamic clusters and to provide early warnings, in order to prevent degraded service and interruptions due to eventual failures of the cluster nodes. The goal of the project is to develop a system that will digest monitoring information coming from the cluster, analyze it in near real time, and provide the necessary input for the control engine of the experiments' workload management systems. The system should be generic and not coupled to any experiment framework, so that it can be used by any LHC experiment.