Quantitative Analysis of Data Caching for the HL-LHC Data-lake


Bibliographic Details
Main author: Umana Chacon, Irvin Jadurier
Language: eng
Published: 2018
Subjects:
Online access: http://cds.cern.ch/record/2642553
_version_ 1780960306836537344
author Umana Chacon, Irvin Jadurier
author_facet Umana Chacon, Irvin Jadurier
author_sort Umana Chacon, Irvin Jadurier
collection CERN
description Given the development of the High Luminosity Large Hadron Collider (HL-LHC), the Worldwide LHC Computing Grid (WLCG) will face unprecedented computing challenges. The amount and complexity of the data generated by the different experiments at CERN will increase, and so the WLCG has developed a strategy to handle this through caching and the creation of data-lakes. This paper presents a quantitative analysis of this caching by exploring the relationship between cache size and hit-rate, using a simulation based on the interactions of a Tier 2 site and the Data Center. The results show that, at least for these two sites, increasing the cache’s capacity above 0.45 petabytes will not raise the hit-rate above 41%. These results are based on the Data Center’s interactions registered in log files from May 31 to June 25, 2018.
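The kind of simulation the abstract describes — replaying logged file requests against a cache of fixed capacity and measuring the hit-rate — can be sketched as below. This is an illustrative model only, not the author's actual code: the LRU eviction policy, the file identifiers, and the toy trace are all assumptions.

```python
from collections import OrderedDict

def simulate_hit_rate(accesses, capacity):
    """Replay a sequence of (file_id, size_bytes) requests against an
    LRU cache with the given byte capacity and return the hit-rate."""
    cache = OrderedDict()  # file_id -> size, ordered least- to most-recently used
    used = 0
    hits = 0
    for file_id, size in accesses:
        if file_id in cache:
            hits += 1
            cache.move_to_end(file_id)  # mark as most recently used
            continue
        # Miss: evict least-recently-used files until the new one fits.
        while cache and used + size > capacity:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        if size <= capacity:
            cache[file_id] = size
            used += size
    return hits / len(accesses)

# Toy trace: three distinct 1 GB files requested repeatedly.
GB = 10**9
trace = [("a", GB), ("b", GB), ("a", GB), ("c", GB), ("a", GB), ("b", GB)]
print(simulate_hit_rate(trace, capacity=2 * GB))  # → 0.3333333333333333
```

Sweeping `capacity` over a range of values and plotting the resulting hit-rates is what produces a size-versus-hit-rate curve like the one the paper reports, where the curve flattens once the cache is large enough to hold the working set.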
id cern-2642553
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2018
record_format invenio
spelling cern-2642553 2019-09-30T06:29:59Z http://cds.cern.ch/record/2642553 eng Umana Chacon, Irvin Jadurier Quantitative Analysis of Data Caching for the HL-LHC Data-lake Physics in General Given the development of the High Luminosity Large Hadron Collider (HL-LHC), the Worldwide LHC Computing Grid (WLCG) will face unprecedented computing challenges. The amount and complexity of the data generated by the different experiments at CERN will increase, and so the WLCG has developed a strategy to handle this through caching and the creation of data-lakes. This paper presents a quantitative analysis of this caching by exploring the relationship between cache size and hit-rate, using a simulation based on the interactions of a Tier 2 site and the Data Center. The results show that, at least for these two sites, increasing the cache’s capacity above 0.45 petabytes will not raise the hit-rate above 41%. These results are based on the Data Center’s interactions registered in log files from May 31 to June 25, 2018. CERN-STUDENTS-Note-2018-191 oai:cds.cern.ch:2642553 2018-10-10
spellingShingle Physics in General
Umana Chacon, Irvin Jadurier
Quantitative Analysis of Data Caching for the HL-LHC Data-lake
title Quantitative Analysis of Data Caching for the HL-LHC Data-lake
title_full Quantitative Analysis of Data Caching for the HL-LHC Data-lake
title_fullStr Quantitative Analysis of Data Caching for the HL-LHC Data-lake
title_full_unstemmed Quantitative Analysis of Data Caching for the HL-LHC Data-lake
title_short Quantitative Analysis of Data Caching for the HL-LHC Data-lake
title_sort quantitative analysis of data caching for the hl-lhc data-lake
topic Physics in General
url http://cds.cern.ch/record/2642553
work_keys_str_mv AT umanachaconirvinjadurier quantitativeanalysisofdatacachingforthehllhcdatalake