HBase / Phoenix-based Data Collection and Storage for the ATLAS EventIndex

The ATLAS EventIndex is the global catalogue of all ATLAS real and simulated events. During the LHC long shutdown between Run 2 (2015-2018) and Run 3 (2022-2025) its components were substantially revised, and a new system has been deployed for the start of Run 3 in Spring 2022. The new core storage...

Bibliographic Details
Main authors: Garcia Montoro, Carlos; Sanchez Martinez, Francisco Javier; Barberis, Dario; Gonzalez De La Hoz, Santiago; Salt, Jose
Language: English
Published: 2023
Subjects: Particle Physics - Experiment
Online access: http://cds.cern.ch/record/2853549
author Garcia Montoro, Carlos
Sanchez Martinez, Francisco Javier
Barberis, Dario
Gonzalez De La Hoz, Santiago
Salt, Jose
collection CERN
description The ATLAS EventIndex is the global catalogue of all ATLAS real and simulated events. During the LHC long shutdown between Run 2 (2015-2018) and Run 3 (2022-2025), its components were substantially revised, and a new system was deployed for the start of Run 3 in spring 2022. The new core storage system is based on HBase tables with a Phoenix interface. It allows faster data ingestion rates and scales better than the old system. This paper describes the data collection, the technical design of the core storage, and the properties that make it performant. The events table, which already holds more than 400 billion entries, and all the auxiliary tables have a compact and optimized design. The EventIndex Supervisor, in charge of orchestrating the whole data collection, has been simplified thanks to the loaders, the Spark jobs that load the data into the new core system. The extractors, in charge of preparing the pieces of data that the loaders will put into the final back-end, have been updated too. The data migration from HDFS to HBase and Phoenix is also described.
id cern-2853549
institution European Organization for Nuclear Research
language eng
publishDate 2023
record_format invenio
spelling cern-2853549; 2023-03-31T09:10:03Z; ATL-SOFT-SLIDE-2023-033; oai:cds.cern.ch:2853549; 2023-03-26
title HBase / Phoenix-based Data Collection and Storage for the ATLAS EventIndex
topic Particle Physics - Experiment
url http://cds.cern.ch/record/2853549