Real-time complex event processing for cloud resources

The ongoing integration of clouds into the WLCG raises the need for detailed health and performance monitoring of the virtual resources in order to prevent problems of degraded service and interruptions due to undetected failures. When working at scale, the existing monitoring diversity can lead to...

Bibliographic Details
Main authors: Adam, M; Cordeiro, C; Field, L; Giordano, D; Magnoni, L
Language: eng
Published: 2017
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1088/1742-6596/898/4/042020
http://cds.cern.ch/record/2297282
collection CERN
description The ongoing integration of clouds into the WLCG raises the need for detailed health and performance monitoring of the virtual resources, in order to prevent degraded service and interruptions due to undetected failures. When working at scale, the diversity of existing monitoring tools can lead to a metric overflow, whereby operators need to manually collect and correlate data from several monitoring tools and frameworks, resulting in tens of different metrics to be constantly interpreted and analyzed per virtual machine. In this paper we present an ESPER-based standalone application which is able to process complex monitoring events coming from various sources and automatically interpret the data in order to issue alarms on the resources’ statuses, without interfering with the actual resources and data sources. We describe how this application has been used with both commercial and non-commercial cloud activities, allowing operators to be alerted quickly and to react to misbehaving VMs and LHC experiments’ workflows. We present the pattern analysis mechanisms being used, as well as the surrounding Elastic and REST API interfaces where the alarms are collected and served to users.
id oai-inspirehep.net-1638289
institution European Organization for Nuclear Research
language eng
publishDate 2017
record_format invenio
title Real-time complex event processing for cloud resources
topic Computing and Computers