
Computing Resource Optimization for a Log Monitoring System

A Large Ion Collider Experiment (ALICE) at the Large Hadron Collider (LHC) in the European Organization for Nuclear Research (CERN) laboratory was built to study heavy-ion collisions and the properties of the quark-gluon plasma. The Online and Offline (O2) software systems of the experiment generate...


Bibliographic Details
Main authors: Srithai, Thanin; Barroso, Vasco Chibante; Phunchongharn, Phond
Language: eng
Published: 2022
Subjects: Computing and Computers; Information Transfer and Management
Online access: https://dx.doi.org/10.1109/ICKII55100.2022.9983580
http://cds.cern.ch/record/2846164
_version_ 1780976619034247168
author Srithai, Thanin
Barroso, Vasco Chibante
Phunchongharn, Phond
collection CERN
description A Large Ion Collider Experiment (ALICE) at the Large Hadron Collider (LHC) in the European Organization for Nuclear Research (CERN) laboratory was built to study heavy-ion collisions and the properties of the quark-gluon plasma. The Online and Offline (O2) software systems of the experiment generate a huge amount of log data that is used for monitoring to detect potential system failures. Elasticsearch was selected as the log storage and search engine for the monitoring system. One of the main problems is how to allocate computing resources for Elasticsearch while minimizing cost and satisfying performance thresholds (i.e., throughput). Moreover, a lack of knowledge of the search engine's behavior makes it difficult to find the best configuration. Exhaustive search is one potential approach to solving this problem. However, it is not practical since it consumes a lot of time and computing resources. Due to the limited resources, Bayesian optimization is applied as a solution. The Bayesian method requires only a few samples to create a surrogate function that roughly represents the objective function, i.e., minimizing cost while satisfying the performance needs. Then, the method explores only the area where the optimal solution exists with a high probability. The results show that Bayesian optimization provides the optimal or near-optimal computing resource configuration for given benchmark experiments while requiring only about half of the evaluations compared to other methods, e.g., exhaustive search, regression, and machine learning. The impact of several acquisition functions and initial sample generators was studied in order to find the best solution. These insights can help system operators search for an optimal computing resource configuration quickly and efficiently.
id cern-2846164
institution European Organization for Nuclear Research
language eng
publishDate 2022
record_format invenio
spelling cern-2846164 2023-01-27T15:48:13Z doi:10.1109/ICKII55100.2022.9983580 http://cds.cern.ch/record/2846164 eng oai:cds.cern.ch:2846164 2022
title Computing Resource Optimization for a Log Monitoring System
topic Computing and Computers
Information Transfer and Management
url https://dx.doi.org/10.1109/ICKII55100.2022.9983580
http://cds.cern.ch/record/2846164