A Serverless Engine for High Energy Physics Distributed Analysis
The Large Hadron Collider (LHC) at CERN has generated in the last decade an unprecedented volume of data for the High-Energy Physics (HEP) field. Scientific collaborations interested in analysing such data very often require computing power beyond a single machine. This issue has been tackled tradit...
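Main authors: | Kuśnierz, Jacek; Padulano, Vincenzo Eduardo; Malawski, Maciej; Burkiewicz, Kamil; Saavedra, Enric Tejedor; Alonso-Jordá, Pedro; Pitt, Michael; Avati, Valentina |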
---|---|
Language: | eng |
Published: | 2022 |
Subjects: | cs.DC; Computing and Computers |
Online access: | https://dx.doi.org/10.1109/CCGrid54584.2022.00067 http://cds.cern.ch/record/2815205 |
_version_ | 1780973496038326272 |
---|---|
author | Kuśnierz, Jacek Padulano, Vincenzo Eduardo Malawski, Maciej Burkiewicz, Kamil Saavedra, Enric Tejedor Alonso-Jordá, Pedro Pitt, Michael Avati, Valentina |
author_sort | Kuśnierz, Jacek |
collection | CERN |
description | The Large Hadron Collider (LHC) at CERN has generated an unprecedented volume of data for the High-Energy Physics (HEP) field over the last decade. Scientific collaborations interested in analysing such data very often require computing power beyond a single machine. This issue has traditionally been tackled by running analyses in distributed environments using stateful, managed batch computing systems. While this approach has been effective so far, current estimates for future computing needs of the field present large scaling challenges. Such a managed approach may not be the only viable way to tackle them, and an interesting alternative could be provided by serverless architectures, to enable an even larger scaling potential. This work describes a novel approach to running real HEP scientific applications through a distributed serverless computing engine. The engine is built upon ROOT, a well-established HEP data analysis software, and distributes its computations to a large pool of concurrent executions on the Amazon Web Services Lambda serverless platform. Thanks to the developed tool, physicists are able to access datasets stored at CERN (including those under restricted access policies) and process them on remote infrastructures outside of their typical environment. The analysis of the serverless functions is monitored at runtime to gather performance metrics, both for data- and computation-intensive workloads. |
id | cern-2815205 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2022 |
record_format | invenio |
spelling | cern-2815205 2023-01-31T10:48:55Z doi:10.1109/CCGrid54584.2022.00067 http://cds.cern.ch/record/2815205 eng arXiv:2206.00942 oai:cds.cern.ch:2815205 2022-06-02 |
title | A Serverless Engine for High Energy Physics Distributed Analysis |
topic | cs.DC Computing and Computers |
url | https://dx.doi.org/10.1109/CCGrid54584.2022.00067 http://cds.cern.ch/record/2815205 |
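
The description field in this record mentions distributing computations to a large pool of concurrent executions. As a rough illustration of that general pattern only, the sketch below splits a dataset into entry ranges, processes each range independently, and merges the partial histograms. It is not the engine from the paper: it uses neither ROOT nor AWS Lambda, a local thread pool stands in for the serverless invocations, and every function name (`split_ranges`, `process_range`, `merge`, `run_distributed`) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(n_entries, n_tasks):
    """Split [0, n_entries) into contiguous (start, end) ranges, one per task."""
    n_tasks = max(1, min(n_tasks, n_entries))
    base, extra = divmod(n_entries, n_tasks)
    ranges, start = [], 0
    for i in range(n_tasks):
        # The first `extra` ranges take one additional entry each.
        end = start + base + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def process_range(values, start, end, n_bins, lo, hi):
    """Map step: fill a fixed-width histogram from values[start:end]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values[start:end]:
        if lo <= v < hi:  # values outside [lo, hi) are ignored
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts


def merge(partials, n_bins):
    """Reduce step: sum the partial histograms bin by bin."""
    total = [0] * n_bins
    for partial in partials:
        for i, count in enumerate(partial):
            total[i] += count
    return total


def run_distributed(values, n_tasks, n_bins=4, lo=0.0, hi=8.0):
    """Run the map step concurrently (threads stand in for remote functions)."""
    ranges = split_ranges(len(values), n_tasks)
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        partials = list(
            pool.map(
                lambda r: process_range(values, r[0], r[1], n_bins, lo, hi),
                ranges,
            )
        )
    return merge(partials, n_bins)


if __name__ == "__main__":
    sample = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 7.9, -1.0, 8.0]
    print(run_distributed(sample, n_tasks=3))
```

Because each range is processed without shared state and the merge is a plain sum, the result does not depend on how many tasks the work is split into, which is the property that makes this kind of workload a candidate for stateless function execution.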