LHCb: Monitoring the DIRAC Distribution System

DIRAC is the LHCb gateway to any computing grid infrastructure (currently supporting WLCG) and is intended to reliably run large data mining activities. The DIRAC system consists of various services (which wait to be contacted to perform actions) and agents (which carry out periodic activities) to d...

Bibliographic Details
Main authors: Nandakumar, R, Seco Miguelez, M, Santinelli, R
Language: eng
Published: 2009
Online access: http://cds.cern.ch/record/1170452
_version_ 1780916137663397888
author Nandakumar, R
Seco Miguelez, M
Santinelli, R
author_facet Nandakumar, R
Seco Miguelez, M
Santinelli, R
author_sort Nandakumar, R
collection CERN
description DIRAC is the LHCb gateway to any computing grid infrastructure (currently supporting WLCG) and is intended to reliably run large data mining activities. The DIRAC system consists of various services (which wait to be contacted to perform actions) and agents (which carry out periodic activities) to direct jobs as required. An important part of ensuring the reliability of the infrastructure is the monitoring and logging of these DIRAC distributed systems. Monitoring collects information from two sources: the first is pinging the services and tracking the regular heartbeats of the agents; the second is the analysis of the error messages generated by both agents and services and collected by the logging system. This allows us to ensure that the components are running properly and to collect useful information regarding their operations. The process status monitoring is displayed using the SLS sensor mechanism, which also automatically allows one to plot various quantities and keep a history of the system. A dedicated GridMap interface (ServiceMap) gives production shifters and experts an immediate, high-impact view of the status of all LHCb critical services, while offering the possibility to refer to details of the SLS and SAM sensors. Error types and statistics provided by the logging service can be accessed via dedicated web interfaces on the DIRAC portal or programmatically via the Python-based API and CLI.
id cern-1170452
institution European Organization for Nuclear Research
language eng
publishDate 2009
record_format invenio
spelling cern-11704522019-09-30T06:29:59Zhttp://cds.cern.ch/record/1170452engNandakumar, RSeco Miguelez, MSantinelli, RLHCb: Monitoring the DIRAC Distribution System DIRAC is the LHCb gateway to any computing grid infrastructure (currently supporting WLCG) and is intended to reliably run large data mining activities. The DIRAC system consists of various services (which wait to be contacted to perform actions) and agents (which carry out periodic activities) to direct jobs as required. An important part of ensuring the reliability of the infrastructure is the monitoring and logging of these DIRAC distributed systems. Monitoring collects information from two sources: the first is pinging the services and tracking the regular heartbeats of the agents; the second is the analysis of the error messages generated by both agents and services and collected by the logging system. This allows us to ensure that the components are running properly and to collect useful information regarding their operations. The process status monitoring is displayed using the SLS sensor mechanism, which also automatically allows one to plot various quantities and keep a history of the system. A dedicated GridMap interface (ServiceMap) gives production shifters and experts an immediate, high-impact view of the status of all LHCb critical services, while offering the possibility to refer to details of the SLS and SAM sensors. Error types and statistics provided by the logging service can be accessed via dedicated web interfaces on the DIRAC portal or programmatically via the Python-based API and CLI.Poster-2009-102oai:cds.cern.ch:11704522009-03-24
spellingShingle Nandakumar, R
Seco Miguelez, M
Santinelli, R
LHCb: Monitoring the DIRAC Distribution System
title LHCb: Monitoring the DIRAC Distribution System
title_full LHCb: Monitoring the DIRAC Distribution System
title_fullStr LHCb: Monitoring the DIRAC Distribution System
title_full_unstemmed LHCb: Monitoring the DIRAC Distribution System
title_short LHCb: Monitoring the DIRAC Distribution System
title_sort lhcb: monitoring the dirac distribution system
url http://cds.cern.ch/record/1170452
work_keys_str_mv AT nandakumarr lhcbmonitoringthediracdistributionsystem
AT secomiguelezm lhcbmonitoringthediracdistributionsystem
AT santinellir lhcbmonitoringthediracdistributionsystem
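
The description above names two status checks: pinging services and tracking the regular heartbeats of agents. The following is a minimal Python sketch of that idea only, not the DIRAC implementation. The class ComponentMonitor, its methods, the status strings, and the timeout parameter are all hypothetical names chosen for illustration; the record does not describe the actual DIRAC API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ComponentMonitor:
    """Hypothetical status monitor; none of these names come from DIRAC."""

    # Assumption: an agent is considered stale after this many silent seconds.
    heartbeat_timeout: float
    # Maps agent name to the timestamp of its most recent heartbeat.
    last_heartbeat: Dict[str, float] = field(default_factory=dict)

    def record_heartbeat(self, agent: str, timestamp: float) -> None:
        """Store the time at which an agent last reported in."""
        self.last_heartbeat[agent] = timestamp

    def agent_status(self, agent: str, now: float) -> str:
        """Classify an agent by the age of its last heartbeat."""
        seen = self.last_heartbeat.get(agent)
        if seen is None:
            return "unknown"  # no heartbeat has ever been recorded
        return "ok" if now - seen <= self.heartbeat_timeout else "stale"

    @staticmethod
    def service_status(ping: Callable[[], bool]) -> str:
        """Classify a service by calling a caller-supplied ping function.

        The ping callable stands in for whatever transport a real
        deployment uses; a raised exception is treated as a failed ping.
        """
        try:
            return "ok" if ping() else "down"
        except Exception:
            return "down"
```

Passing timestamps explicitly, rather than reading the clock inside the class, keeps the sketch deterministic: a caller would supply time.time() in practice and fixed values when checking the logic.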