Evaluating Kubernetes as an orchestrator of the Event Filter computing farm of the Trigger and Data Acquisition system of the ATLAS experiment at the Large Hadron Collider

Bibliographic Details
Main authors: Avolio, Giuseppe; Cadeddu, Mattia; Hauser, Reiner
Language: eng
Published: 2018
Subjects: Particle Physics - Experiment
Online access: https://dx.doi.org/10.1051/epjconf/201921407024
http://cds.cern.ch/record/2642353
_version_ 1780960295971192832
author Avolio, Giuseppe
Cadeddu, Mattia
Hauser, Reiner
collection CERN
description The ATLAS experiment at the LHC relies on a complex, distributed Trigger and Data Acquisition (TDAQ) system to gather and select particle collision data. The Event Filter (EF) component of the TDAQ system executes advanced selection algorithms, reducing the data rate to a level suitable for recording to permanent storage. The EF functionality is provided by a computing farm made up of thousands of commodity servers, each executing one or more processes. Moving the EF farm management towards a solution based on software containers is one of the main themes of the ATLAS TDAQ Phase-II upgrades in the area of online software; it would open new possibilities for fault tolerance, reliability and scalability. This paper presents the results of an evaluation of Kubernetes, a system for advanced management of containerized applications in large clusters, as a possible orchestrator of the ATLAS TDAQ EF computing farm. It first highlights some of the technical solutions adopted to run the offline version of today’s EF software in a Docker container. It then focuses on scaling performance measurements executed on a cluster of 1000 CPU cores. In particular, it reports how Kubernetes scales in deploying containers as a function of the cluster size and shows how proper tuning of the Kubernetes Queries per Second (QPS) parameter set can improve the scaling of applications in terms of running replicas. Finally, it assesses the possibility of using Kubernetes as an orchestrator of the EF computing farm in LHC’s Run 4.
id cern-2642353
institution European Organization for Nuclear Research (CERN)
language eng
publishDate 2018
record_format invenio
spelling cern-2642353 2022-08-10T12:22:23Z doi:10.1051/epjconf/201921407024 http://cds.cern.ch/record/2642353 eng ATL-DAQ-PROC-2018-022 oai:cds.cern.ch:2642353 2018-10-08
title Evaluating Kubernetes as an orchestrator of the Event Filter computing farm of the Trigger and Data Acquisition system of the ATLAS experiment at the Large Hadron Collider
topic Particle Physics - Experiment
url https://dx.doi.org/10.1051/epjconf/201921407024
http://cds.cern.ch/record/2642353