Real-time complex event processing for cloud resources
The ongoing integration of clouds into the WLCG raises the need for detailed health and performance monitoring of the virtual resources, in order to prevent degraded service and interruptions caused by undetected failures. When working at scale, the existing monitoring diversity can lead to...
Main authors: | Adam, M; Cordeiro, C; Field, L; Giordano, D; Magnoni, L |
---|---|
Language: | eng |
Published: | 2017 |
Subjects: | Computing and Computers |
Online access: | https://dx.doi.org/10.1088/1742-6596/898/4/042020 http://cds.cern.ch/record/2297282 |
_version_ | 1780956898835562496 |
---|---|
author | Adam, M; Cordeiro, C; Field, L; Giordano, D; Magnoni, L |
collection | CERN |
description | The ongoing integration of clouds into the WLCG raises the need for detailed health and performance monitoring of the virtual resources, in order to prevent degraded service and interruptions caused by undetected failures. When working at scale, the existing monitoring diversity can lead to a metric overflow, whereby operators must manually collect and correlate data from several monitoring tools and frameworks, leaving tens of different metrics to be constantly interpreted and analyzed per virtual machine. In this paper we present an ESPER-based standalone application that processes complex monitoring events coming from various sources and automatically interprets the data in order to issue alarms on the resources’ statuses, without interfering with the actual resources and data sources. We describe how this application has been used with both commercial and non-commercial cloud activities, allowing operators to be alerted quickly and to react to misbehaving VMs and LHC experiments’ workflows. We present the pattern analysis mechanisms being used, as well as the surrounding Elastic and REST API interfaces where the alarms are collected and served to users. |
id | oai-inspirehep.net-1638289 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2017 |
record_format | invenio |
title | Real-time complex event processing for cloud resources |
topic | Computing and Computers |
url | https://dx.doi.org/10.1088/1742-6596/898/4/042020 http://cds.cern.ch/record/2297282 |
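
The abstract describes correlating monitoring events from several sources and raising alarms on resource status. As a rough illustration of that general idea only, the sketch below keeps a per-VM sliding time window over one metric and emits an alarm when the windowed mean crosses a threshold. It is written in plain Python and does not use ESPER or reproduce the authors' rules; the `Event` fields, the metric name `cpu_load`, the rule class and all thresholds are hypothetical names and values chosen for the example.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One monitoring sample (all field names are assumptions for this sketch)."""
    vm: str           # hypothetical VM identifier
    metric: str       # hypothetical metric name, e.g. "cpu_load"
    value: float
    timestamp: float  # seconds


class SlidingWindowRule:
    """Alarm when the mean of one metric over a time window exceeds a threshold."""

    def __init__(self, metric, window_seconds, threshold, min_events=3):
        self.metric = metric
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.min_events = min_events
        self._windows = defaultdict(deque)  # one window per VM

    def process(self, event):
        """Feed one event; return an alarm dict or None."""
        if event.metric != self.metric:
            return None
        window = self._windows[event.vm]
        window.append(event)
        # Evict samples that have fallen out of the time window.
        while event.timestamp - window[0].timestamp > self.window_seconds:
            window.popleft()
        if len(window) < self.min_events:
            return None
        mean = sum(e.value for e in window) / len(window)
        if mean > self.threshold:
            return {"vm": event.vm, "metric": self.metric,
                    "mean": mean, "timestamp": event.timestamp}
        return None


def run(events, rules):
    """Replay events in timestamp order through every rule and collect alarms."""
    alarms = []
    for event in sorted(events, key=lambda e: e.timestamp):
        for rule in rules:
            alarm = rule.process(event)
            if alarm is not None:
                alarms.append(alarm)
    return alarms


if __name__ == "__main__":
    samples = [
        Event("vm-1", "cpu_load", 0.95, 0.0),
        Event("vm-1", "cpu_load", 0.97, 10.0),
        Event("vm-1", "cpu_load", 0.99, 20.0),
        Event("vm-2", "cpu_load", 0.10, 5.0),
    ]
    rule = SlidingWindowRule("cpu_load", window_seconds=60, threshold=0.9)
    for alarm in run(samples, [rule]):
        print(alarm)
```

Because the rule only reads events passed to it, a design like this stays decoupled from the monitored VMs and from the tools producing the metrics, which matches the non-interference goal stated in the abstract. A real deployment would express such rules in the event-processing engine's own query language rather than in hand-written classes.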