Cargando…

Distributed Data Collection for the ATLAS EventIndex

The ATLAS EventIndex contains records of all events processed by ATLAS, in all processing stages. These records include the references to the files containing each event (the GUID of the file) and the internal “pointer” to each event in the file. This information is collected by all jobs that run at...

Descripción completa

Detalles Bibliográficos
Autores principales: Sánchez, Javier, Fernandez Casani, Alvaro, Gonzalez de la Hoz, Santiago
Lenguaje:eng
Publicado: 2015
Materias:
Acceso en línea:https://dx.doi.org/10.1088/1742-6596/664/4/042046
http://cds.cern.ch/record/2016348
Descripción
Sumario:The ATLAS EventIndex contains records of all events processed by ATLAS, in all processing stages. These records include the references to the files containing each event (the GUID of the file) and the internal “pointer” to each event in the file. This information is collected by all jobs that run at Tier-0 or on the Grid and process ATLAS events. Each job produces a snippet of information for each permanent output file. This information is packed and transferred to a central broker at CERN using an ActiveMQ messaging system, and then is unpacked, sorted and reformatted in order to be stored and catalogued into a central Hadoop server. This contribution describes in detail the Producer/Consumer architecture to convey this information from the running jobs through the messaging system to the Hadoop server.