
The ATLAS EventIndex and its evolution based on Apache Kudu storage

The ATLAS experiment has produced hundreds of petabytes of data and expects to have one order of magnitude more in the future. These data are spread among hundreds of computing Grid sites around the world. The EventIndex catalogues the basic elements of these data: real and simulated events. It provides...


Bibliographic Details
Main authors: Barberis, Dario, Prokoshin, Fedor, Alexandrov, Evgeny, Aleksandrov, Igor, Baranowski, Zbigniew, Canali, Luca, Dimitrov, Gancho, Fernandez Casani, Alvaro, Gallas, Elizabeth, Garcia Montoro, Carlos, Gonzalez de la Hoz, Santiago, Hrivnac, Julius, Iakovlev, Alexander, Kazymov, Andrei, Mineev, Mikhail, Rybkin, Grigori, Sánchez, Javier, Salt, José, Vasileva, Petya Tsvetanova, Villaplana Perez, Miguel
Language: eng
Published: 2018
Subjects: Particle Physics - Experiment
Online access: http://cds.cern.ch/record/2646132
_version_ 1780960477715628032
author Barberis, Dario
Prokoshin, Fedor
Alexandrov, Evgeny
Aleksandrov, Igor
Baranowski, Zbigniew
Canali, Luca
Dimitrov, Gancho
Fernandez Casani, Alvaro
Gallas, Elizabeth
Garcia Montoro, Carlos
Gonzalez de la Hoz, Santiago
Hrivnac, Julius
Iakovlev, Alexander
Kazymov, Andrei
Mineev, Mikhail
Rybkin, Grigori
Sánchez, Javier
Salt, José
Vasileva, Petya Tsvetanova
Villaplana Perez, Miguel
author_facet Barberis, Dario
Prokoshin, Fedor
Alexandrov, Evgeny
Aleksandrov, Igor
Baranowski, Zbigniew
Canali, Luca
Dimitrov, Gancho
Fernandez Casani, Alvaro
Gallas, Elizabeth
Garcia Montoro, Carlos
Gonzalez de la Hoz, Santiago
Hrivnac, Julius
Iakovlev, Alexander
Kazymov, Andrei
Mineev, Mikhail
Rybkin, Grigori
Sánchez, Javier
Salt, José
Vasileva, Petya Tsvetanova
Villaplana Perez, Miguel
author_sort Barberis, Dario
collection CERN
description The ATLAS experiment has produced hundreds of petabytes of data and expects to have one order of magnitude more in the future. These data are spread among hundreds of computing Grid sites around the world. The EventIndex catalogues the basic elements of these data: real and simulated events. It provides the means to select and access event data in the ATLAS distributed storage system, and supports completeness and consistency checks and data overlap studies. The EventIndex employs various data handling technologies, such as Hadoop and Oracle databases, and is integrated with other elements of the ATLAS distributed computing infrastructure, including systems for data, metadata, and production management (AMI, Rucio and PANDA). The project has been in operation since the start of LHC Run 2 in 2015, and is under continuous development to meet analysis and production demands and to follow technology evolution. The main data store in Hadoop, based on MapFiles and HBase, can work for the rest of Run 2, but new solutions are being explored for the future. Kudu offers an interesting environment, with a mixture of BigData and relational database features, which looked promising at the design level and is now used to build a prototype to measure the scaling capabilities as a function of data input rates, total data volumes and data query and retrieval rates. An extension of the EventIndex functionalities to support the concept of Virtual Datasets produced additional requirements that are tested on the same Kudu prototype, in order to estimate the system performance and response times for different internal data organisations. This paper reports on the current system performance and on the first measurements of the new prototype based on Kudu.
id cern-2646132
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2018
record_format invenio
spelling cern-2646132 | 2019-09-30T06:29:59Z | http://cds.cern.ch/record/2646132 | eng | Barberis, Dario | Prokoshin, Fedor | Alexandrov, Evgeny | Aleksandrov, Igor | Baranowski, Zbigniew | Canali, Luca | Dimitrov, Gancho | Fernandez Casani, Alvaro | Gallas, Elizabeth | Garcia Montoro, Carlos | Gonzalez de la Hoz, Santiago | Hrivnac, Julius | Iakovlev, Alexander | Kazymov, Andrei | Mineev, Mikhail | Rybkin, Grigori | Sánchez, Javier | Salt, José | Vasileva, Petya Tsvetanova | Villaplana Perez, Miguel | The ATLAS EventIndex and its evolution based on Apache Kudu storage | Particle Physics - Experiment | The ATLAS experiment has produced hundreds of petabytes of data and expects to have one order of magnitude more in the future. These data are spread among hundreds of computing Grid sites around the world. The EventIndex catalogues the basic elements of these data: real and simulated events. It provides the means to select and access event data in the ATLAS distributed storage system, and supports completeness and consistency checks and data overlap studies. The EventIndex employs various data handling technologies, such as Hadoop and Oracle databases, and is integrated with other elements of the ATLAS distributed computing infrastructure, including systems for data, metadata, and production management (AMI, Rucio and PANDA). The project has been in operation since the start of LHC Run 2 in 2015, and is under continuous development to meet analysis and production demands and to follow technology evolution. The main data store in Hadoop, based on MapFiles and HBase, can work for the rest of Run 2, but new solutions are being explored for the future. Kudu offers an interesting environment, with a mixture of BigData and relational database features, which looked promising at the design level and is now used to build a prototype to measure the scaling capabilities as a function of data input rates, total data volumes and data query and retrieval rates.
An extension of the EventIndex functionalities to support the concept of Virtual Datasets produced additional requirements that are tested on the same Kudu prototype, in order to estimate the system performance and response times for different internal data organisations. This paper reports on the current system performance and on the first measurements of the new prototype based on Kudu. | ATL-SOFT-PROC-2018-017 | oai:cds.cern.ch:2646132 | 2018-11-06
spellingShingle Particle Physics - Experiment
Barberis, Dario
Prokoshin, Fedor
Alexandrov, Evgeny
Aleksandrov, Igor
Baranowski, Zbigniew
Canali, Luca
Dimitrov, Gancho
Fernandez Casani, Alvaro
Gallas, Elizabeth
Garcia Montoro, Carlos
Gonzalez de la Hoz, Santiago
Hrivnac, Julius
Iakovlev, Alexander
Kazymov, Andrei
Mineev, Mikhail
Rybkin, Grigori
Sánchez, Javier
Salt, José
Vasileva, Petya Tsvetanova
Villaplana Perez, Miguel
The ATLAS EventIndex and its evolution based on Apache Kudu storage
title The ATLAS EventIndex and its evolution based on Apache Kudu storage
title_full The ATLAS EventIndex and its evolution based on Apache Kudu storage
title_fullStr The ATLAS EventIndex and its evolution based on Apache Kudu storage
title_full_unstemmed The ATLAS EventIndex and its evolution based on Apache Kudu storage
title_short The ATLAS EventIndex and its evolution based on Apache Kudu storage
title_sort atlas eventindex and its evolution based on apache kudu storage
topic Particle Physics - Experiment
url http://cds.cern.ch/record/2646132
work_keys_str_mv AT barberisdario theatlaseventindexanditsevolutionbasedonapachekudustorage
AT prokoshinfedor theatlaseventindexanditsevolutionbasedonapachekudustorage
AT alexandrovevgeny theatlaseventindexanditsevolutionbasedonapachekudustorage
AT aleksandrovigor theatlaseventindexanditsevolutionbasedonapachekudustorage
AT baranowskizbigniew theatlaseventindexanditsevolutionbasedonapachekudustorage
AT canaliluca theatlaseventindexanditsevolutionbasedonapachekudustorage
AT dimitrovgancho theatlaseventindexanditsevolutionbasedonapachekudustorage
AT fernandezcasanialvaro theatlaseventindexanditsevolutionbasedonapachekudustorage
AT gallaselizabeth theatlaseventindexanditsevolutionbasedonapachekudustorage
AT garciamontorocarlos theatlaseventindexanditsevolutionbasedonapachekudustorage
AT gonzalezdelahozsantiago theatlaseventindexanditsevolutionbasedonapachekudustorage
AT hrivnacjulius theatlaseventindexanditsevolutionbasedonapachekudustorage
AT iakovlevalexander theatlaseventindexanditsevolutionbasedonapachekudustorage
AT kazymovandrei theatlaseventindexanditsevolutionbasedonapachekudustorage
AT mineevmikhail theatlaseventindexanditsevolutionbasedonapachekudustorage
AT rybkingrigori theatlaseventindexanditsevolutionbasedonapachekudustorage
AT sanchezjavier theatlaseventindexanditsevolutionbasedonapachekudustorage
AT saltjose theatlaseventindexanditsevolutionbasedonapachekudustorage
AT vasilevapetyatsvetanova theatlaseventindexanditsevolutionbasedonapachekudustorage
AT villaplanaperezmiguel theatlaseventindexanditsevolutionbasedonapachekudustorage
AT barberisdario atlaseventindexanditsevolutionbasedonapachekudustorage
AT prokoshinfedor atlaseventindexanditsevolutionbasedonapachekudustorage
AT alexandrovevgeny atlaseventindexanditsevolutionbasedonapachekudustorage
AT aleksandrovigor atlaseventindexanditsevolutionbasedonapachekudustorage
AT baranowskizbigniew atlaseventindexanditsevolutionbasedonapachekudustorage
AT canaliluca atlaseventindexanditsevolutionbasedonapachekudustorage
AT dimitrovgancho atlaseventindexanditsevolutionbasedonapachekudustorage
AT fernandezcasanialvaro atlaseventindexanditsevolutionbasedonapachekudustorage
AT gallaselizabeth atlaseventindexanditsevolutionbasedonapachekudustorage
AT garciamontorocarlos atlaseventindexanditsevolutionbasedonapachekudustorage
AT gonzalezdelahozsantiago atlaseventindexanditsevolutionbasedonapachekudustorage
AT hrivnacjulius atlaseventindexanditsevolutionbasedonapachekudustorage
AT iakovlevalexander atlaseventindexanditsevolutionbasedonapachekudustorage
AT kazymovandrei atlaseventindexanditsevolutionbasedonapachekudustorage
AT mineevmikhail atlaseventindexanditsevolutionbasedonapachekudustorage
AT rybkingrigori atlaseventindexanditsevolutionbasedonapachekudustorage
AT sanchezjavier atlaseventindexanditsevolutionbasedonapachekudustorage
AT saltjose atlaseventindexanditsevolutionbasedonapachekudustorage
AT vasilevapetyatsvetanova atlaseventindexanditsevolutionbasedonapachekudustorage
AT villaplanaperezmiguel atlaseventindexanditsevolutionbasedonapachekudustorage