Anomaly Detection for the Centralised Elasticsearch Service at CERN
For several years CERN has been offering a centralised service for Elasticsearch, a popular distributed system for search and analytics of user provided data. The service offered by CERN IT is better described as a service of services, delivering centrally managed and maintained Elasticsearch instan...
Main authors: | Andersson, Jennifer R; Moya, Jose Alonso; Schwickerath, Ulrich |
---|---|
Language: | eng |
Published: | 2021 |
Subjects: | Computing and Computers |
Online access: | https://dx.doi.org/10.3389/fdata.2021.718879 http://cds.cern.ch/record/2801667 |
_version_ | 1780972712999518208 |
---|---|
author | Andersson, Jennifer R Moya, Jose Alonso Schwickerath, Ulrich |
author_facet | Andersson, Jennifer R Moya, Jose Alonso Schwickerath, Ulrich |
author_sort | Andersson, Jennifer R |
collection | CERN |
description | For several years CERN has been offering a centralised service for Elasticsearch, a popular
distributed system for search and analytics of user provided data. The service offered by
CERN IT is better described as a service of services, delivering centrally managed and
maintained Elasticsearch instances to CERN users who have a justified need for it. This
dynamic infrastructure currently consists of about 30 distinct and independent
Elasticsearch installations, in the following referred to as Elasticsearch clusters, some
of which are shared between different user communities. The service is used by several
hundred users mainly for logs and service analytics. Due to its size and complexity, the
installation produces a huge amount of internal monitoring data which can be difficult to
process in real time with limited available person power. Early on, an idea was therefore
born to process this data automatically, aiming to extract anomalies and possible issues
building up in real time, allowing the experts to address them before they start to cause an
issue for the users of the service. Both deep learning and traditional methods have been
applied to analyse the data in order to achieve this goal. This resulted in the current
deployment of an anomaly detection system based on a one layer multi dimensional LSTM
neural network, coupled with applying a simple moving average to the data to validate the
results. This paper will describe which methods were investigated and give an overview of
the current system, including data retrieval, data pre-processing and analysis. In addition,
reports on experiences gained when applying the system to actual data will be provided.
Finally, weaknesses of the current system will be briefly discussed, and ideas for future
system improvements will be sketched out. |
id | cern-2801667 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2021 |
record_format | invenio |
spelling | cern-2801667 2022-02-16T21:15:57Z doi:10.3389/fdata.2021.718879 http://cds.cern.ch/record/2801667 eng Andersson, Jennifer R Moya, Jose Alonso Schwickerath, Ulrich Anomaly Detection for the Centralised Elasticsearch Service at CERN Computing and Computers For several years CERN has been offering a centralised service for Elasticsearch, a popular distributed system for search and analytics of user provided data. The service offered by CERN IT is better described as a service of services, delivering centrally managed and maintained Elasticsearch instances to CERN users who have a justified need for it. This dynamic infrastructure currently consists of about 30 distinct and independent Elasticsearch installations, in the following referred to as Elasticsearch clusters, some of which are shared between different user communities. The service is used by several hundred users mainly for logs and service analytics. Due to its size and complexity, the installation produces a huge amount of internal monitoring data which can be difficult to process in real time with limited available person power. Early on, an idea was therefore born to process this data automatically, aiming to extract anomalies and possible issues building up in real time, allowing the experts to address them before they start to cause an issue for the users of the service. Both deep learning and traditional methods have been applied to analyse the data in order to achieve this goal. This resulted in the current deployment of an anomaly detection system based on a one layer multi dimensional LSTM neural network, coupled with applying a simple moving average to the data to validate the results. This paper will describe which methods were investigated and give an overview of the current system, including data retrieval, data pre-processing and analysis. In addition, reports on experiences gained when applying the system to actual data will be provided. Finally, weaknesses of the current system will be briefly discussed, and ideas for future system improvements will be sketched out. oai:cds.cern.ch:2801667 2021 |
spellingShingle | Computing and Computers Andersson, Jennifer R Moya, Jose Alonso Schwickerath, Ulrich Anomaly Detection for the Centralised Elasticsearch Service at CERN |
title | Anomaly Detection for the Centralised Elasticsearch Service at CERN |
title_full | Anomaly Detection for the Centralised Elasticsearch Service at CERN |
title_fullStr | Anomaly Detection for the Centralised Elasticsearch Service at CERN |
title_full_unstemmed | Anomaly Detection for the Centralised Elasticsearch Service at CERN |
title_short | Anomaly Detection for the Centralised Elasticsearch Service at CERN |
title_sort | anomaly detection for the centralised elasticsearch service at cern |
topic | Computing and Computers |
url | https://dx.doi.org/10.3389/fdata.2021.718879 http://cds.cern.ch/record/2801667 |
work_keys_str_mv | AT anderssonjenniferr anomalydetectionforthecentralisedelasticsearchserviceatcern AT moyajosealonso anomalydetectionforthecentralisedelasticsearchserviceatcern AT schwickerathulrich anomalydetectionforthecentralisedelasticsearchserviceatcern |