ATLAS EventIndex General Dataflow and Monitoring Infrastructure
The ATLAS EventIndex has been running in production since mid-2015, reliably collecting information worldwide about all produced events and storing it in a central Hadoop infrastructure at CERN. A subset of this information is copied to an Oracle relational database for fast dataset discovery, ev...
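Main authors: | Barberis, Dario, Favareto, Andrea, Fernandez Casani, Alvaro, Garcia Montoro, Carlos, Gonzalez de la Hoz, Santiago, Hrivnac, Julius, Prokoshin, Fedor, Salt, Jose, Sanchez, Javier, Toebbicke, Rainer, Yuan, Ruijun |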
---|---|
Language: | eng |
Published: | 2016 |
Subjects: | Particle Physics - Experiment |
Online access: | http://cds.cern.ch/record/2218034 |
_version_ | 1780952134919913472 |
---|---|
author | Barberis, Dario Favareto, Andrea Fernandez Casani, Alvaro Garcia Montoro, Carlos Gonzalez de la Hoz, Santiago Hrivnac, Julius Prokoshin, Fedor Salt, Jose Sanchez, Javier Toebbicke, Rainer Yuan, Ruijun |
author_facet | Barberis, Dario Favareto, Andrea Fernandez Casani, Alvaro Garcia Montoro, Carlos Gonzalez de la Hoz, Santiago Hrivnac, Julius Prokoshin, Fedor Salt, Jose Sanchez, Javier Toebbicke, Rainer Yuan, Ruijun |
author_sort | Barberis, Dario |
collection | CERN |
description | The ATLAS EventIndex has been running in production since mid-2015, reliably collecting information worldwide about all produced events and storing it in a central Hadoop infrastructure at CERN. A subset of this information is copied to an Oracle relational database for fast dataset discovery, event picking, cross-checks with other ATLAS systems and checks for event duplication. The system design and its optimization serve event-picking requests ranging from a few events up to tens of thousands of events; in addition, data consistency checks are performed for large production campaigns. Detecting duplicate events within the scope of physics collections has recently arisen as an important use case. This paper describes the general architecture of the project and the data flow and operation issues, which are addressed by recent developments to improve the throughput of the overall system. In this direction, the data collection system is reducing its use of the messaging infrastructure to overcome the performance shortcomings detected during production peaks; an object storage approach is instead used to convey the event index information, and messages are used to signal its location and status. Recent changes in the Producer/Consumer architecture are also presented in detail, as well as the monitoring infrastructure. |
id | cern-2218034 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2016 |
record_format | invenio |
spelling | cern-2218034 2019-09-30T06:29:59Z http://cds.cern.ch/record/2218034 eng Barberis, Dario Favareto, Andrea Fernandez Casani, Alvaro Garcia Montoro, Carlos Gonzalez de la Hoz, Santiago Hrivnac, Julius Prokoshin, Fedor Salt, Jose Sanchez, Javier Toebbicke, Rainer Yuan, Ruijun ATLAS EventIndex General Dataflow and Monitoring Infrastructure Particle Physics - Experiment The ATLAS EventIndex has been running in production since mid-2015, reliably collecting information worldwide about all produced events and storing it in a central Hadoop infrastructure at CERN. A subset of this information is copied to an Oracle relational database for fast dataset discovery, event picking, cross-checks with other ATLAS systems and checks for event duplication. The system design and its optimization serve event-picking requests ranging from a few events up to tens of thousands of events; in addition, data consistency checks are performed for large production campaigns. Detecting duplicate events within the scope of physics collections has recently arisen as an important use case. This paper describes the general architecture of the project and the data flow and operation issues, which are addressed by recent developments to improve the throughput of the overall system. In this direction, the data collection system is reducing its use of the messaging infrastructure to overcome the performance shortcomings detected during production peaks; an object storage approach is instead used to convey the event index information, and messages are used to signal its location and status. Recent changes in the Producer/Consumer architecture are also presented in detail, as well as the monitoring infrastructure. ATL-SOFT-SLIDE-2016-694 oai:cds.cern.ch:2218034 2016-09-24 |
spellingShingle | Particle Physics - Experiment Barberis, Dario Favareto, Andrea Fernandez Casani, Alvaro Garcia Montoro, Carlos Gonzalez de la Hoz, Santiago Hrivnac, Julius Prokoshin, Fedor Salt, Jose Sanchez, Javier Toebbicke, Rainer Yuan, Ruijun ATLAS EventIndex General Dataflow and Monitoring Infrastructure |
title | ATLAS EventIndex General Dataflow and Monitoring Infrastructure |
title_full | ATLAS EventIndex General Dataflow and Monitoring Infrastructure |
title_fullStr | ATLAS EventIndex General Dataflow and Monitoring Infrastructure |
title_full_unstemmed | ATLAS EventIndex General Dataflow and Monitoring Infrastructure |
title_short | ATLAS EventIndex General Dataflow and Monitoring Infrastructure |
title_sort | atlas eventindex general dataflow and monitoring infrastructure |
topic | Particle Physics - Experiment |
url | http://cds.cern.ch/record/2218034 |
work_keys_str_mv | AT barberisdario atlaseventindexgeneraldataflowandmonitoringinfrastructure AT favaretoandrea atlaseventindexgeneraldataflowandmonitoringinfrastructure AT fernandezcasanialvaro atlaseventindexgeneraldataflowandmonitoringinfrastructure AT garciamontorocarlos atlaseventindexgeneraldataflowandmonitoringinfrastructure AT gonzalezdelahozsantiago atlaseventindexgeneraldataflowandmonitoringinfrastructure AT hrivnacjulius atlaseventindexgeneraldataflowandmonitoringinfrastructure AT prokoshinfedor atlaseventindexgeneraldataflowandmonitoringinfrastructure AT saltjose atlaseventindexgeneraldataflowandmonitoringinfrastructure AT sanchezjavier atlaseventindexgeneraldataflowandmonitoringinfrastructure AT toebbickerainer atlaseventindexgeneraldataflowandmonitoringinfrastructure AT yuanruijun atlaseventindexgeneraldataflowandmonitoringinfrastructure |