Preparing distributed computing operations for HL-LHC era with Operational Intelligence
The Operational Intelligence (OpInt) project is a joint effort from various WLCG communities aimed at increasing the level of automation in computing operations and reducing human interventions. The currently deployed systems have proven to be mature and capable of meeting the experiment goals, by a...
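Main authors: | Di Girolamo, Alessandro; Legger, Federica; Paparrigopoulos, Panos; Schovancova, Jaroslava; Beermann, Thomas; Boehler, Michael; Bonacorsi, Daniele; Clissa, Luca; Diotalevi, Tommaso; Giommi, Luca; Giordano, Domenico; Hohn, David; Javurek, Tomas; Jezequel, Stephane; Kuznetsov, Valentin Y; Lassnig, Mario; Olocco, Micol; Padolski, Siarhei; Rinaldi, Lorenzo; Sharma, Mayank; Decker De Sousa, Leticia |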
---|---|
Language: | eng |
Published: | 2021 |
Subjects: | Particle Physics - Experiment |
Online access: | http://cds.cern.ch/record/2765704 |
_version_ | 1780971157251424256 |
---|---|
author | Di Girolamo, Alessandro Legger, Federica Paparrigopoulos, Panos Schovancova, Jaroslava Beermann, Thomas Boehler, Michael Bonacorsi, Daniele Clissa, Luca Diotalevi, Tommaso Giommi, Luca Giordano, Domenico Hohn, David Javurek, Tomas Jezequel, Stephane Kuznetsov, Valentin Y Lassnig, Mario Olocco, Micol Padolski, Siarhei Rinaldi, Lorenzo Sharma, Mayank Decker De Sousa, Leticia |
author_facet | Di Girolamo, Alessandro Legger, Federica Paparrigopoulos, Panos Schovancova, Jaroslava Beermann, Thomas Boehler, Michael Bonacorsi, Daniele Clissa, Luca Diotalevi, Tommaso Giommi, Luca Giordano, Domenico Hohn, David Javurek, Tomas Jezequel, Stephane Kuznetsov, Valentin Y Lassnig, Mario Olocco, Micol Padolski, Siarhei Rinaldi, Lorenzo Sharma, Mayank Decker De Sousa, Leticia |
author_sort | Di Girolamo, Alessandro |
collection | CERN |
description | The Operational Intelligence (OpInt) project is a joint effort from various WLCG communities aimed at increasing the level of automation in computing operations and reducing human interventions. The currently deployed systems have proven to be mature and capable of meeting the experiment goals, by allowing timely delivery of scientific results. However, a substantial number of interventions from software developers, shifters and operational teams is needed to efficiently manage such heterogeneous infrastructures. Under the scope of the OpInt project experts from most of the relevant areas have gathered to propose and work on “smart” solutions. Machine learning, data mining, log analysis, and anomaly detection are only some of the tools we have evaluated for our use cases. Discussions have led to a number of ideas on how to achieve our goals and the development of solutions has started. In this contribution, we will report on the development of a suite of OpInt services to cover various use cases: workload management, data management, and site operations. |
id | cern-2765704 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2021 |
record_format | invenio |
spelling | cern-27657042022-08-23T09:03:26Zhttp://cds.cern.ch/record/2765704engDi Girolamo, AlessandroLegger, FedericaPaparrigopoulos, PanosSchovancova, JaroslavaBeermann, ThomasBoehler, MichaelBonacorsi, DanieleClissa, LucaDiotalevi, TommasoGiommi, LucaGiordano, DomenicoHohn, DavidJavurek, TomasJezequel, StephaneKuznetsov, Valentin YLassnig, MarioOlocco, MicolPadolski, SiarheiRinaldi, LorenzoSharma, MayankDecker De Sousa, LeticiaPreparing distributed computing operations for HL-LHC era with Operational IntelligenceParticle Physics - ExperimentThe Operational Intelligence (OpInt) project is a joint effort from various WLCG communities aimed at increasing the level of automation in computing operations and reducing human interventions. The currently deployed systems have proven to be mature and capable of meeting the experiment goals, by allowing timely delivery of scientific results. However, a substantial number of interventions from software developers, shifters and operational teams is needed to efficiently manage such heterogeneous infrastructures. Under the scope of the OpInt project experts from most of the relevant areas have gathered to propose and work on “smart” solutions. Machine learning, data mining, log analysis, and anomaly detection are only some of the tools we have evaluated for our use cases. Discussions have led to a number of ideas on how to achieve our goals and the development of solutions has started. In this contribution, we will report on the development of a suite of OpInt services to cover various use cases: workload management, data management, and site operations.ATL-SOFT-SLIDE-2021-139oai:cds.cern.ch:27657042021-05-03 |
spellingShingle | Particle Physics - Experiment Di Girolamo, Alessandro Legger, Federica Paparrigopoulos, Panos Schovancova, Jaroslava Beermann, Thomas Boehler, Michael Bonacorsi, Daniele Clissa, Luca Diotalevi, Tommaso Giommi, Luca Giordano, Domenico Hohn, David Javurek, Tomas Jezequel, Stephane Kuznetsov, Valentin Y Lassnig, Mario Olocco, Micol Padolski, Siarhei Rinaldi, Lorenzo Sharma, Mayank Decker De Sousa, Leticia Preparing distributed computing operations for HL-LHC era with Operational Intelligence |
title | Preparing distributed computing operations for HL-LHC era with Operational Intelligence |
title_full | Preparing distributed computing operations for HL-LHC era with Operational Intelligence |
title_fullStr | Preparing distributed computing operations for HL-LHC era with Operational Intelligence |
title_full_unstemmed | Preparing distributed computing operations for HL-LHC era with Operational Intelligence |
title_short | Preparing distributed computing operations for HL-LHC era with Operational Intelligence |
title_sort | preparing distributed computing operations for hl-lhc era with operational intelligence |
topic | Particle Physics - Experiment |
url | http://cds.cern.ch/record/2765704 |
work_keys_str_mv | AT digirolamoalessandro preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT leggerfederica preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT paparrigopoulospanos preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT schovancovajaroslava preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT beermannthomas preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT boehlermichael preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT bonacorsidaniele preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT clissaluca preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT diotalevitommaso preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT giommiluca preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT giordanodomenico preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT hohndavid preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT javurektomas preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT jezequelstephane preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT kuznetsovvalentiny preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT lassnigmario preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT oloccomicol preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT padolskisiarhei preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT rinaldilorenzo preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT sharmamayank preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence AT deckerdesousaleticia preparingdistributedcomputingoperationsforhllhcerawithoperationalintelligence |