Service monitoring in the LHC experiments

The LHC experiments computing infrastructure is hosted in a distributed way across different computing centers in the Worldwide LHC Computing Grid (WLCG [1]) and needs to run with high reliability. It is therefore crucial to offer a unified view to shifters, who generally are not experts in the serv...

Bibliographic Details
Main authors: Barreiro Megino, Fernando, Bernardoff, Vincent, da Silva Gomes, Diego, di Girolamo, Alessandro, Flix, Jos, Kreuzer, Peter, Roiser, Stefan
Language: eng
Published: 2012
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1088/1742-6596/396/3/032010
http://cds.cern.ch/record/1565919
_version_ 1780930948665180160
author Barreiro Megino, Fernando
Bernardoff, Vincent
da Silva Gomes, Diego
di Girolamo, Alessandro
Flix, Jos
Kreuzer, Peter
Roiser, Stefan
author_facet Barreiro Megino, Fernando
Bernardoff, Vincent
da Silva Gomes, Diego
di Girolamo, Alessandro
Flix, Jos
Kreuzer, Peter
Roiser, Stefan
author_sort Barreiro Megino, Fernando
collection CERN
description The LHC experiments computing infrastructure is hosted in a distributed way across different computing centers in the Worldwide LHC Computing Grid (WLCG [1]) and needs to run with high reliability. It is therefore crucial to offer a unified view to shifters, who generally are not experts in the services, and give them the ability to follow the status of resources and the health of critical systems in order to alert the experts whenever a system becomes unavailable. Several experiments have chosen to build their service monitoring on top of the flexible Service Level Status (SLS) framework developed by CERN IT. Based on examples from ATLAS, CMS and LHCb, this contribution will describe the complete development process of a service-monitoring instance and explain the deployment models that can be adopted.
id cern-1565919
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2012
record_format invenio
spelling cern-1565919 | 2022-08-17T13:25:22Z | doi:10.1088/1742-6596/396/3/032010 | http://cds.cern.ch/record/1565919 | eng | Barreiro Megino, Fernando; Bernardoff, Vincent; da Silva Gomes, Diego; di Girolamo, Alessandro; Flix, Jos; Kreuzer, Peter; Roiser, Stefan | Service monitoring in the LHC experiments | Computing and Computers | The LHC experiments computing infrastructure is hosted in a distributed way across different computing centers in the Worldwide LHC Computing Grid (WLCG [1]) and needs to run with high reliability. It is therefore crucial to offer a unified view to shifters, who generally are not experts in the services, and give them the ability to follow the status of resources and the health of critical systems in order to alert the experts whenever a system becomes unavailable. Several experiments have chosen to build their service monitoring on top of the flexible Service Level Status (SLS) framework developed by CERN IT. Based on examples from ATLAS, CMS and LHCb, this contribution will describe the complete development process of a service-monitoring instance and explain the deployment models that can be adopted. | oai:cds.cern.ch:1565919 | 2012
spellingShingle Computing and Computers
Barreiro Megino, Fernando
Bernardoff, Vincent
da Silva Gomes, Diego
di Girolamo, Alessandro
Flix, Jos
Kreuzer, Peter
Roiser, Stefan
Service monitoring in the LHC experiments
title Service monitoring in the LHC experiments
title_full Service monitoring in the LHC experiments
title_fullStr Service monitoring in the LHC experiments
title_full_unstemmed Service monitoring in the LHC experiments
title_short Service monitoring in the LHC experiments
title_sort service monitoring in the lhc experiments
topic Computing and Computers
url https://dx.doi.org/10.1088/1742-6596/396/3/032010
http://cds.cern.ch/record/1565919
work_keys_str_mv AT barreiromeginofernando servicemonitoringinthelhcexperiments
AT bernardoffvincent servicemonitoringinthelhcexperiments
AT dasilvagomesdiego servicemonitoringinthelhcexperiments
AT digirolamoalessandro servicemonitoringinthelhcexperiments
AT flixjos servicemonitoringinthelhcexperiments
AT kreuzerpeter servicemonitoringinthelhcexperiments
AT roiserstefan servicemonitoringinthelhcexperiments