Cargando…

Event-Driven Messaging for Offline Data Quality Monitoring at ATLAS

During LHC Run 1, the information flow through the offline data quality monitoring in ATLAS relied heavily on chains of processes polling each other's outputs for handshaking purposes. This resulted in a fragile architecture with many possible points of failure and an inability to monitor the o...

Descripción completa

Detalles Bibliográficos
Autor principal: Onyisi, Peter
Lenguaje:eng
Publicado: 2015
Materias:
Acceso en línea:https://dx.doi.org/10.1088/1742-6596/664/6/062045
http://cds.cern.ch/record/2016790
Descripción
Sumario:During LHC Run 1, the information flow through the offline data quality monitoring in ATLAS relied heavily on chains of processes polling each other's outputs for handshaking purposes. This resulted in a fragile architecture with many possible points of failure and an inability to monitor the overall state of the distributed system. We report on the status of a project undertaken during the LHC shutdown to replace the ad hoc synchronization methods with a uniform message queue system. This enables the use of standard protocols to connect processes on multiple hosts; reliable transmission of messages between possibly unreliable programs; easy monitoring of the information flow; and the removal of inefficient polling-based communication.