LHCb: Monitoring the DIRAC Distribution System
DIRAC is the LHCb gateway to any computing grid infrastructure (currently supporting WLCG) and is intended to reliably run large data mining activities. The DIRAC system consists of various services (which wait to be contacted to perform actions) and agents (which carry out periodic activities) to d...
Main authors: | Nandakumar, R; Seco Miguelez, M; Santinelli, R |
---|---|
Language: | eng |
Published: | 2009 |
Online access: | http://cds.cern.ch/record/1170452 |
_version_ | 1780916137663397888 |
---|---|
author | Nandakumar, R Seco Miguelez, M Santinelli, R |
author_facet | Nandakumar, R Seco Miguelez, M Santinelli, R |
author_sort | Nandakumar, R |
collection | CERN |
description | DIRAC is the LHCb gateway to any computing grid infrastructure (currently supporting WLCG) and is intended to reliably run large data mining activities. The DIRAC system consists of various services (which wait to be contacted to perform actions) and agents (which carry out periodic activities) to direct jobs as required. An important part of ensuring the reliability of the infrastructure is the monitoring and logging of these DIRAC distributed systems. The monitoring is done by collecting information from two sources: one is pinging the services or keeping track of the regular heartbeats of the agents, and the other is the analysis of the error messages generated by both agents and services and collected by the logging system. This allows us to ensure that the components are running properly and to collect useful information regarding their operations. The process status monitoring is displayed using the SLS sensor mechanism, which also automatically allows one to plot various quantities and keep a history of the system. A dedicated GridMap interface (ServiceMap) allows production shifters and experts to have an immediate, high-impact view of the status of all LHCb critical services while offering the possibility to refer to details of the SLS and SAM sensors. Error types and statistics provided by the logging service can be accessed via dedicated web interfaces on the DIRAC portal or programmatically via the Python-based API and CLI. |
id | cern-1170452 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2009 |
record_format | invenio |
spelling | cern-1170452 2019-09-30T06:29:59Z http://cds.cern.ch/record/1170452 eng Nandakumar, R Seco Miguelez, M Santinelli, R LHCb: Monitoring the DIRAC Distribution System DIRAC is the LHCb gateway to any computing grid infrastructure (currently supporting WLCG) and is intended to reliably run large data mining activities. The DIRAC system consists of various services (which wait to be contacted to perform actions) and agents (which carry out periodic activities) to direct jobs as required. An important part of ensuring the reliability of the infrastructure is the monitoring and logging of these DIRAC distributed systems. The monitoring is done by collecting information from two sources: one is pinging the services or keeping track of the regular heartbeats of the agents, and the other is the analysis of the error messages generated by both agents and services and collected by the logging system. This allows us to ensure that the components are running properly and to collect useful information regarding their operations. The process status monitoring is displayed using the SLS sensor mechanism, which also automatically allows one to plot various quantities and keep a history of the system. A dedicated GridMap interface (ServiceMap) allows production shifters and experts to have an immediate, high-impact view of the status of all LHCb critical services while offering the possibility to refer to details of the SLS and SAM sensors. Error types and statistics provided by the logging service can be accessed via dedicated web interfaces on the DIRAC portal or programmatically via the Python-based API and CLI. Poster-2009-102 oai:cds.cern.ch:1170452 2009-03-24 |
spellingShingle | Nandakumar, R Seco Miguelez, M Santinelli, R LHCb: Monitoring the DIRAC Distribution System |
title | LHCb: Monitoring the DIRAC Distribution System |
title_full | LHCb: Monitoring the DIRAC Distribution System |
title_fullStr | LHCb: Monitoring the DIRAC Distribution System |
title_full_unstemmed | LHCb: Monitoring the DIRAC Distribution System |
title_short | LHCb: Monitoring the DIRAC Distribution System |
title_sort | lhcb: monitoring the dirac distribution system |
url | http://cds.cern.ch/record/1170452 |
work_keys_str_mv | AT nandakumarr lhcbmonitoringthediracdistributionsystem AT secomiguelezm lhcbmonitoringthediracdistributionsystem AT santinellir lhcbmonitoringthediracdistributionsystem |
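
The description above names two monitoring inputs: agent heartbeats and error messages collected by the logging system. The following is a minimal sketch of how such inputs could be turned into a status and into error statistics. It is not the DIRAC API: the `Heartbeat` record, the `agent_status` and `error_statistics` functions, the log record keys, and the thresholds are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Heartbeat:
    """Last heartbeat seen from an agent (hypothetical record layout)."""
    agent: str
    last_seen: float  # seconds since the epoch


def agent_status(hb: Heartbeat, now: float, period: float, tolerance: int = 2) -> str:
    """Classify an agent by the age of its last heartbeat.

    'ok' if the heartbeat is at most one period old, 'late' if it is at most
    `tolerance` periods old, otherwise 'down'. The thresholds are assumptions.
    """
    age = now - hb.last_seen
    if age <= period:
        return "ok"
    if age <= tolerance * period:
        return "late"
    return "down"


def error_statistics(log_records):
    """Count ERROR-level messages per (component, message) pair.

    Each record is assumed to be a dict with 'component', 'level' and
    'message' keys; this layout is illustrative, not a real log schema.
    """
    return Counter(
        (rec["component"], rec["message"])
        for rec in log_records
        if rec["level"] == "ERROR"
    )
```

A status string of this kind is the sort of value a sensor display could plot over time, while the counter gives the per-component error types and counts that a web or command-line front end could list.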