Understanding the Performance of a Prototype of a WLCG Data Lake for HL-LHC
Storage is identified as one of the main challenges for WLCG in the next decade in the computing strategy document for HL-LHC [1]. Extrapolating today's computing models, the ATLAS and CMS experiments alone would need one order of magnitude more storage resources than what could be provided by the fu...
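Main authors: | Schovancová, Jaroslava; Campana, Simone; Espinal Curull, Xavier; Girone, Maria; Kadochnikov, Ivan; McCance, Gavin John |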
---|---|
Language: | eng |
Published: | 2018 |
Subjects: | Computing and Computers; Detectors and Experimental Techniques |
Online access: | https://dx.doi.org/10.1109/eScience.2018.00080 http://cds.cern.ch/record/2674723 |
_version_ | 1780962630941278208 |
---|---|
author | Schovancová, Jaroslava Campana, Simone Espinal Curull, Xavier Girone, Maria Kadochnikov, Ivan McCance, Gavin John |
author_facet | Schovancová, Jaroslava Campana, Simone Espinal Curull, Xavier Girone, Maria Kadochnikov, Ivan McCance, Gavin John |
author_sort | Schovancová, Jaroslava |
collection | CERN |
description | Storage is identified as one of the main challenges for WLCG in the next decade in the computing strategy document for HL-LHC [1]. Extrapolating today's computing models, the ATLAS and CMS experiments alone would need one order of magnitude more storage resources than the funding agencies could provide. Organization and consolidation of storage and evolution of the compute facilities will be central in addressing the possible resource shortage. In this contribution we describe the architecture of a prototype of a WLCG data lake for HL-LHC. A WLCG data lake aims to provide a geographically distributed storage service spanning large data centres interconnected by fast, low-latency networks. We present the methodology used to measure and understand the performance of a WLCG data lake prototype, in order to compare event throughput, at the same cost and at the same compute facilities, when those facilities are backed by traditional storage services and when they are backed by the WLCG data lake. We will discuss various possible data processing models with respect to network latency, available storage media, and data caching approaches. We will present benchmarks for storage service and compute performance. |
id | oai-inspirehep.net-1721166 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2018 |
record_format | invenio |
spelling | oai-inspirehep.net-1721166 2019-09-30T06:29:59Z doi:10.1109/eScience.2018.00080 http://cds.cern.ch/record/2674723 eng Schovancová, Jaroslava Campana, Simone Espinal Curull, Xavier Girone, Maria Kadochnikov, Ivan McCance, Gavin John Understanding the Performance of a Prototype of a WLCG Data Lake for HL-LHC Computing and Computers Detectors and Experimental Techniques Storage is identified as one of the main challenges for WLCG in the next decade in the computing strategy document for HL-LHC [1]. Extrapolating today's computing models, the ATLAS and CMS experiments alone would need one order of magnitude more storage resources than the funding agencies could provide. Organization and consolidation of storage and evolution of the compute facilities will be central in addressing the possible resource shortage. In this contribution we describe the architecture of a prototype of a WLCG data lake for HL-LHC. A WLCG data lake aims to provide a geographically distributed storage service spanning large data centres interconnected by fast, low-latency networks. We present the methodology used to measure and understand the performance of a WLCG data lake prototype, in order to compare event throughput, at the same cost and at the same compute facilities, when those facilities are backed by traditional storage services and when they are backed by the WLCG data lake. We will discuss various possible data processing models with respect to network latency, available storage media, and data caching approaches. We will present benchmarks for storage service and compute performance. oai:inspirehep.net:1721166 2018 |
spellingShingle | Computing and Computers Detectors and Experimental Techniques Schovancová, Jaroslava Campana, Simone Espinal Curull, Xavier Girone, Maria Kadochnikov, Ivan McCance, Gavin John Understanding the Performance of a Prototype of a WLCG Data Lake for HL-LHC |
title | Understanding the Performance of a Prototype of a WLCG Data Lake for HL-LHC |
title_full | Understanding the Performance of a Prototype of a WLCG Data Lake for HL-LHC |
title_fullStr | Understanding the Performance of a Prototype of a WLCG Data Lake for HL-LHC |
title_full_unstemmed | Understanding the Performance of a Prototype of a WLCG Data Lake for HL-LHC |
title_short | Understanding the Performance of a Prototype of a WLCG Data Lake for HL-LHC |
title_sort | understanding the performance of a prototype of a wlcg data lake for hl-lhc |
topic | Computing and Computers Detectors and Experimental Techniques |
url | https://dx.doi.org/10.1109/eScience.2018.00080 http://cds.cern.ch/record/2674723 |
work_keys_str_mv | AT schovancovajaroslava understandingtheperformanceofaprototypeofawlcgdatalakeforhllhc AT campanasimone understandingtheperformanceofaprototypeofawlcgdatalakeforhllhc AT espinalcurullxavier understandingtheperformanceofaprototypeofawlcgdatalakeforhllhc AT gironemaria understandingtheperformanceofaprototypeofawlcgdatalakeforhllhc AT kadochnikovivan understandingtheperformanceofaprototypeofawlcgdatalakeforhllhc AT mccancegavinjohn understandingtheperformanceofaprototypeofawlcgdatalakeforhllhc |