An Assessment of a Model for Error Processing in the CMS Data Acquisition System
The CMS Data Acquisition System consists of O(20000) interdependent services. A system providing exception and application-specific monitoring data is essential for the operation of such a cluster. Due to the number of involved services the amount of monitoring data is higher than a human operator c...
Main authors: | Dustdar, Schahram; Gutleber, Johannes; Moser, Roland; Orsini, Luciano |
---|---|
Language: | eng |
Published: | 2009 |
Subjects: | Detectors and Experimental Techniques |
Online access: | https://dx.doi.org/10.1088/1742-6596/219/2/022039 http://cds.cern.ch/record/1196135 |
_version_ | 1780917063308541952 |
---|---|
author | Dustdar, Schahram; Gutleber, Johannes; Moser, Roland; Orsini, Luciano |
author_facet | Dustdar, Schahram; Gutleber, Johannes; Moser, Roland; Orsini, Luciano |
author_sort | Dustdar, Schahram |
collection | CERN |
description | The CMS Data Acquisition System consists of O(20000) interdependent services. A system providing exception and application-specific monitoring data is essential for the operation of such a cluster. Due to the number of involved services the amount of monitoring data is higher than a human operator can handle efficiently. Thus, moving the expert knowledge for error analysis from the operator to a dedicated system is a natural choice. This reduces the number of notifications to the operator for simpler visualization and provides meaningful error cause descriptions and suggestions for possible countermeasures. This paper discusses an architecture of a workflow-based hierarchical error analysis system based on Guardians for the CMS Data Acquisition System. Guardians provide a common interface for error analysis of a specific service or subsystem. To provide effective and complete error analysis, the requirements regarding information sources, monitoring, and configuration are analyzed. Formats for common notification types are defined, and a generic Guardian based on Event-Condition-Action rules is presented as a proof of concept. |
id | cern-1196135 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2009 |
record_format | invenio |
spelling | cern-1196135 2019-09-30T06:29:59Z doi:10.1088/1742-6596/219/2/022039 http://cds.cern.ch/record/1196135 eng Dustdar, Schahram; Gutleber, Johannes; Moser, Roland; Orsini, Luciano An Assessment of a Model for Error Processing in the CMS Data Acquisition System Detectors and Experimental Techniques The CMS Data Acquisition System consists of O(20000) interdependent services. A system providing exception and application-specific monitoring data is essential for the operation of such a cluster. Due to the number of involved services the amount of monitoring data is higher than a human operator can handle efficiently. Thus moving the expert-knowledge for error analysis from the operator to a dedicated system is a natural choice. This reduces the number of notifications to the operator for simpler visualization and provides meaningful error cause descriptions and suggestions for possible countermeasures. This paper discusses an architecture of a workflow-based hierarchical error analysis system based on Guardians for the CMS Data Acquisition System. Guardians provide a common interface for error analysis of a specific service or subsystem. To provide effective and complete error analysis, the requirements regarding information sources, monitoring and configuration, are analyzed. Formats for common notification types are defined and a generic Guardian based on Event-Condition-Action rules is presented as a proof-of-concept. CMS-CR-2009-064 oai:cds.cern.ch:1196135 2009-05-08 |
spellingShingle | Detectors and Experimental Techniques; Dustdar, Schahram; Gutleber, Johannes; Moser, Roland; Orsini, Luciano; An Assessment of a Model for Error Processing in the CMS Data Acquisition System |
title | An Assessment of a Model for Error Processing in the CMS Data Acquisition System |
title_full | An Assessment of a Model for Error Processing in the CMS Data Acquisition System |
title_fullStr | An Assessment of a Model for Error Processing in the CMS Data Acquisition System |
title_full_unstemmed | An Assessment of a Model for Error Processing in the CMS Data Acquisition System |
title_short | An Assessment of a Model for Error Processing in the CMS Data Acquisition System |
title_sort | assessment of a model for error processing in the cms data acquisition system |
topic | Detectors and Experimental Techniques |
url | https://dx.doi.org/10.1088/1742-6596/219/2/022039 http://cds.cern.ch/record/1196135 |
work_keys_str_mv | AT dustdarschahram anassessmentofamodelforerrorprocessinginthecmsdataacquisitionsystem AT gutleberjohannes anassessmentofamodelforerrorprocessinginthecmsdataacquisitionsystem AT moserroland anassessmentofamodelforerrorprocessinginthecmsdataacquisitionsystem AT orsiniluciano anassessmentofamodelforerrorprocessinginthecmsdataacquisitionsystem AT dustdarschahram assessmentofamodelforerrorprocessinginthecmsdataacquisitionsystem AT gutleberjohannes assessmentofamodelforerrorprocessinginthecmsdataacquisitionsystem AT moserroland assessmentofamodelforerrorprocessinginthecmsdataacquisitionsystem AT orsiniluciano assessmentofamodelforerrorprocessinginthecmsdataacquisitionsystem |
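
The description field mentions a generic Guardian based on Event-Condition-Action rules. As a rough illustration of that idea only, the sketch below shows one way such a rule engine might look. It is not taken from the paper: every name here (`Notification`, `Rule`, `Guardian`, `make_demo_guardian`, the `buffer_usage` field, and the sample rule) is a hypothetical assumption chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


# Hypothetical notification format; the paper's actual formats are not shown in this record.
@dataclass
class Notification:
    source: str                                   # service that emitted the notification
    kind: str                                     # e.g. "exception" or "monitoring"
    payload: Dict[str, float] = field(default_factory=dict)


# One Event-Condition-Action rule: fire `action` when the event kind matches
# and `condition` holds for the notification.
@dataclass
class Rule:
    event: str
    condition: Callable[[Notification], bool]
    action: Callable[[Notification], str]         # returns a message for the operator


class Guardian:
    """Hypothetical analyzer that applies ECA rules to incoming notifications."""

    def __init__(self, rules: List[Rule]) -> None:
        self.rules = rules

    def analyze(self, notification: Notification) -> List[str]:
        # Collect the output of every rule whose event and condition both match.
        messages = []
        for rule in self.rules:
            if rule.event == notification.kind and rule.condition(notification):
                messages.append(rule.action(notification))
        return messages


def make_demo_guardian() -> Guardian:
    # Assumed example rule: warn when a monitored buffer is more than 90% full.
    rule = Rule(
        event="monitoring",
        condition=lambda n: n.payload.get("buffer_usage", 0.0) > 0.9,
        action=lambda n: f"{n.source}: buffer nearly full; consider reducing the input rate",
    )
    return Guardian([rule])
```

Calling `make_demo_guardian().analyze(...)` with a `monitoring` notification whose `buffer_usage` exceeds 0.9 returns a single operator message; any other notification returns an empty list, which is how a rule set of this kind could reduce the number of notifications reaching the operator.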