Anomaly detection in the CERN cloud infrastructure

Anomaly detection in the CERN OpenStack cloud is a challenging task due to the large scale of the computing infrastructure and, consequently, the large volume of monitoring data to analyse. The current solution to spot anomalous servers in the cloud infrastructure relies on a threshold-based alarmin...

Bibliographic Details
Main authors: Giordano, Domenico, Paltenghi, Matteo, Metaj, Stiven, Dvorak, Antonin
Language: eng
Published: 2021
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1051/epjconf/202125102011
http://cds.cern.ch/record/2814353
_version_ 1780973440244645888
author Giordano, Domenico
Paltenghi, Matteo
Metaj, Stiven
Dvorak, Antonin
author_facet Giordano, Domenico
Paltenghi, Matteo
Metaj, Stiven
Dvorak, Antonin
author_sort Giordano, Domenico
collection CERN
description Anomaly detection in the CERN OpenStack cloud is a challenging task due to the large scale of the computing infrastructure and, consequently, the large volume of monitoring data to analyse. The current solution to spot anomalous servers in the cloud infrastructure relies on a threshold-based alarming system carefully set by the system managers on the performance metrics of each infrastructure component. This contribution explores fully automated, unsupervised machine learning solutions for anomaly detection in time series metrics, adapting both traditional and deep learning approaches. The paper describes a novel end-to-end data analytics pipeline implemented to digest the large amount of monitoring data and to expose anomalies to the system managers. The pipeline relies solely on open-source tools and frameworks, such as Spark, Apache Airflow, Kubernetes, Grafana, and Elasticsearch. In addition, an approach to build annotated datasets from the CERN cloud monitoring data is reported. Finally, the preliminary performance of several anomaly detection algorithms is evaluated using these annotated datasets.
id cern-2814353
institution European Organization for Nuclear Research
language eng
publishDate 2021
record_format invenio
spelling cern-28143532022-07-25T15:37:09Zdoi:10.1051/epjconf/202125102011http://cds.cern.ch/record/2814353engGiordano, DomenicoPaltenghi, MatteoMetaj, StivenDvorak, AntoninAnomaly detection in the CERN cloud infrastructureComputing and ComputersAnomaly detection in the CERN OpenStack cloud is a challenging task due to the large scale of the computing infrastructure and, consequently, the large volume of monitoring data to analyse. The current solution to spot anomalous servers in the cloud infrastructure relies on a threshold-based alarming system carefully set by the system managers on the performance metrics of each infrastructure’s component. This contribution explores fully automated, unsupervised machine learning solutions in the anomaly detection field for time series metrics, by adapting both traditional and deep learning approaches. The paper describes a novel end-to-end data analytics pipeline implemented to digest the large amount of monitoring data and to expose anomalies to the system managers. The pipeline relies solely on open-source tools and frameworks, such as $Spark$, $Apache$ $Airflow$, $Kubernetes$, $Grafana$, $Elasticsearch$. In addition, an approach to build annotated datasets from the CERN cloud monitoring data is reported. Finally, a preliminary performance of a number of anomaly detection algorithms is evaluated by using the aforementioned annotated datasets.oai:cds.cern.ch:28143532021
spellingShingle Computing and Computers
Giordano, Domenico
Paltenghi, Matteo
Metaj, Stiven
Dvorak, Antonin
Anomaly detection in the CERN cloud infrastructure
title Anomaly detection in the CERN cloud infrastructure
title_full Anomaly detection in the CERN cloud infrastructure
title_fullStr Anomaly detection in the CERN cloud infrastructure
title_full_unstemmed Anomaly detection in the CERN cloud infrastructure
title_short Anomaly detection in the CERN cloud infrastructure
title_sort anomaly detection in the cern cloud infrastructure
topic Computing and Computers
url https://dx.doi.org/10.1051/epjconf/202125102011
http://cds.cern.ch/record/2814353
work_keys_str_mv AT giordanodomenico anomalydetectioninthecerncloudinfrastructure
AT paltenghimatteo anomalydetectioninthecerncloudinfrastructure
AT metajstiven anomalydetectioninthecerncloudinfrastructure
AT dvorakantonin anomalydetectioninthecerncloudinfrastructure