Anomaly detection in the CERN cloud infrastructure
Anomaly detection in the CERN OpenStack cloud is a challenging task due to the large scale of the computing infrastructure and, consequently, the large volume of monitoring data to analyse. The current solution to spot anomalous servers in the cloud infrastructure relies on a threshold-based alarmin...
Main authors: | Giordano, Domenico; Paltenghi, Matteo; Metaj, Stiven; Dvorak, Antonin |
---|---|
Language: | eng |
Published: | 2021 |
Subjects: | Computing and Computers |
Online access: | https://dx.doi.org/10.1051/epjconf/202125102011 http://cds.cern.ch/record/2814353 |
_version_ | 1780973440244645888 |
---|---|
author | Giordano, Domenico Paltenghi, Matteo Metaj, Stiven Dvorak, Antonin |
author_facet | Giordano, Domenico Paltenghi, Matteo Metaj, Stiven Dvorak, Antonin |
author_sort | Giordano, Domenico |
collection | CERN |
description | Anomaly detection in the CERN OpenStack cloud is a challenging task due to the large scale of the computing infrastructure and, consequently, the large volume of monitoring data to analyse. The current solution for spotting anomalous servers in the cloud infrastructure relies on a threshold-based alarming system, carefully set by the system managers on the performance metrics of each infrastructure component. This contribution explores fully automated, unsupervised machine learning solutions for anomaly detection in time series metrics, adapting both traditional and deep learning approaches. The paper describes a novel end-to-end data analytics pipeline implemented to digest the large amount of monitoring data and to expose anomalies to the system managers. The pipeline relies solely on open-source tools and frameworks, such as Spark, Apache Airflow, Kubernetes, Grafana, and Elasticsearch. In addition, an approach to building annotated datasets from the CERN cloud monitoring data is reported. Finally, the preliminary performance of a number of anomaly detection algorithms is evaluated using these annotated datasets. |
id | cern-2814353 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2021 |
record_format | invenio |
spellingShingle | Computing and Computers Giordano, Domenico Paltenghi, Matteo Metaj, Stiven Dvorak, Antonin Anomaly detection in the CERN cloud infrastructure |
title | Anomaly detection in the CERN cloud infrastructure |
title_full | Anomaly detection in the CERN cloud infrastructure |
title_fullStr | Anomaly detection in the CERN cloud infrastructure |
title_full_unstemmed | Anomaly detection in the CERN cloud infrastructure |
title_short | Anomaly detection in the CERN cloud infrastructure |
title_sort | anomaly detection in the cern cloud infrastructure |
topic | Computing and Computers |
url | https://dx.doi.org/10.1051/epjconf/202125102011 http://cds.cern.ch/record/2814353 |
work_keys_str_mv | AT giordanodomenico anomalydetectioninthecerncloudinfrastructure AT paltenghimatteo anomalydetectioninthecerncloudinfrastructure AT metajstiven anomalydetectioninthecerncloudinfrastructure AT dvorakantonin anomalydetectioninthecerncloudinfrastructure |
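
The description contrasts a hand-tuned, threshold-based alarm with automated, unsupervised detection on time series metrics. The sketch below illustrates that contrast only. It is not taken from the paper: the record does not name the algorithms evaluated, so the rolling z-score detector, the function names, the window size, the limits, and the `cpu_load` sample values are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def threshold_alarms(values, limit):
    """Static rule: flag indices whose value exceeds a fixed, hand-set limit."""
    return [i for i, v in enumerate(values) if v > limit]


def rolling_zscore_alarms(values, window=5, z_limit=3.0):
    """Flag indices that deviate from the preceding window by more than
    z_limit sample standard deviations (hypothetical stand-in detector)."""
    alarms = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            # A perfectly flat window gives no scale to compare against; skip it.
            continue
        if abs(values[i] - mu) / sigma > z_limit:
            alarms.append(i)
    return alarms


# Made-up CPU load samples for one server, with a single spike at index 6.
cpu_load = [0.30, 0.32, 0.31, 0.29, 0.30, 0.31, 0.75, 0.30, 0.32, 0.31]

static_hits = threshold_alarms(cpu_load, limit=0.9)
adaptive_hits = rolling_zscore_alarms(cpu_load)
```

With these made-up samples, a fixed limit of 0.9 never fires because the spike stays below it, while the window-based rule flags the spike relative to the server's own recent behaviour. This is one way to see why per-metric limits need careful manual tuning, under the assumptions stated above.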