Use of the Hadoop structured storage tools for the ATLAS EventIndex event catalogue
The ATLAS experiment collects billions of events per year of data-taking, and processes them to make them available for physics analysis in several different formats. An even larger number of events is additionally simulated according to physics and detector models and then reconstructed and analysed...
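Main authors: | Favareto, Andrea; Barberis, Dario; Cardenas Zarate, Simon Ernesto; Cranshaw, Jack; Fernandez Casani, Alvaro; Gallas, Elizabeth; Gonzalez de la Hoz, Santiago; Hrivnac, Julius; Malon, David; Prokoshin, Fedor; Salt, Jose; Sanchez, Javier; Toebbicke, Rainer; Yuan, Ruijun; Garcia Montoro, Carlos |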
---|---|
Language: | eng |
Published: | 2015 |
Subjects: | Particle Physics - Experiment |
Online access: | http://cds.cern.ch/record/2055281 |
_version_ | 1780948283314667520 |
---|---|
author | Favareto, Andrea; Barberis, Dario; Cardenas Zarate, Simon Ernesto; Cranshaw, Jack; Fernandez Casani, Alvaro; Gallas, Elizabeth; Gonzalez de la Hoz, Santiago; Hrivnac, Julius; Malon, David; Prokoshin, Fedor; Salt, Jose; Sanchez, Javier; Toebbicke, Rainer; Yuan, Ruijun; Garcia Montoro, Carlos |
author_facet | Favareto, Andrea; Barberis, Dario; Cardenas Zarate, Simon Ernesto; Cranshaw, Jack; Fernandez Casani, Alvaro; Gallas, Elizabeth; Gonzalez de la Hoz, Santiago; Hrivnac, Julius; Malon, David; Prokoshin, Fedor; Salt, Jose; Sanchez, Javier; Toebbicke, Rainer; Yuan, Ruijun; Garcia Montoro, Carlos |
author_sort | Favareto, Andrea |
collection | CERN |
description | The ATLAS experiment collects billions of events per year of data-taking, and processes them to make them available for physics analysis in several different formats. An even larger number of events is additionally simulated according to physics and detector models and then reconstructed and analysed for comparison with real events. The EventIndex is a catalogue of all events in each production stage; for each event it includes a few identification parameters, some basic non-mutable information coming from the online system, and references to the files that contain the event in each format (plus the internal pointers to the event within each file for quick retrieval). Each EventIndex record is logically simple, but the system has to hold many tens of billions of records, all equally important. The Hadoop technology was selected at the start of the EventIndex project development in 2012 and proved to be robust and flexible enough to accommodate this kind of information; both the insertion times and the query response times are acceptable for the continuous and automatic operation that started in spring 2015. This talk will describe the EventIndex data input and organisation in Hadoop and explain the operational challenges that were overcome in order to achieve the expected good performance. |
id | cern-2055281 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2015 |
record_format | invenio |
spelling | cern-20552812019-09-30T06:29:59Zhttp://cds.cern.ch/record/2055281engFavareto, AndreaBarberis, DarioCardenas Zarate, Simon ErnestoCranshaw, JackFernandez Casani, AlvaroGallas, ElizabethGonzalez de la Hoz, SantiagoHrivnac, JuliusMalon, DavidProkoshin, FedorSalt, JoseSanchez, JavierToebbicke, RainerYuan, RuijunGarcia Montoro, CarlosUse of the Hadoop structured storage tools for the ATLAS EventIndex event catalogueParticle Physics - ExperimentThe ATLAS experiment collects billions of events per year of data-taking, and processes them to make them available for physics analysis in several different formats. An even larger amount of events is in addition simulated according to physics and detector models and then reconstructed and analysed to be compared to real events. The EventIndex is a catalogue of all events in each production stage; it includes for each event a few identification parameters, some basic non-mutable information coming from the online system, and the references to the files that contain the event in each format (plus the internal pointers to the event within each file for quick retrieval). Each EventIndex record is logically simple but the system has to hold many tens of billions of records, all equally important. The Hadoop technology was selected at the start of the EventIndex project development in 2012 and proved to be robust and flexible to accommodate this kind of information; both the insertion times and query response times are acceptable for the continuous and automatic operation that started in spring 2015. This talk will describe the EventIndex data input and organisation in Hadoop and explain the operational challenges that were overcome in order to achieve the expected good performance.ATL-SOFT-PROC-2015-059oai:cds.cern.ch:20552812015-09-27 |
spellingShingle | Particle Physics - Experiment Favareto, Andrea Barberis, Dario Cardenas Zarate, Simon Ernesto Cranshaw, Jack Fernandez Casani, Alvaro Gallas, Elizabeth Gonzalez de la Hoz, Santiago Hrivnac, Julius Malon, David Prokoshin, Fedor Salt, Jose Sanchez, Javier Toebbicke, Rainer Yuan, Ruijun Garcia Montoro, Carlos Use of the Hadoop structured storage tools for the ATLAS EventIndex event catalogue |
title | Use of the Hadoop structured storage tools for the ATLAS EventIndex event catalogue |
title_full | Use of the Hadoop structured storage tools for the ATLAS EventIndex event catalogue |
title_fullStr | Use of the Hadoop structured storage tools for the ATLAS EventIndex event catalogue |
title_full_unstemmed | Use of the Hadoop structured storage tools for the ATLAS EventIndex event catalogue |
title_short | Use of the Hadoop structured storage tools for the ATLAS EventIndex event catalogue |
title_sort | use of the hadoop structured storage tools for the atlas eventindex event catalogue |
topic | Particle Physics - Experiment |
url | http://cds.cern.ch/record/2055281 |
work_keys_str_mv | AT favaretoandrea useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT barberisdario useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT cardenaszaratesimonernesto useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT cranshawjack useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT fernandezcasanialvaro useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT gallaselizabeth useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT gonzalezdelahozsantiago useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT hrivnacjulius useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT malondavid useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT prokoshinfedor useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT saltjose useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT sanchezjavier useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT toebbickerainer useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT yuanruijun useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue AT garciamontorocarlos useofthehadoopstructuredstoragetoolsfortheatlaseventindexeventcatalogue |