EXPERIENCE WITH SPLUNK FOR ARCHIVING AND VISUALISATION OF OPERATIONAL DATA IN ATLAS TDAQ SYSTEM

Bibliographic Details
Main authors: Kazarov, Andrei; Mineev, Mikhail; Avolio, Giuseppe; Chitan, Adrian
Language: English
Published: 2017
Online access: https://dx.doi.org/10.1088/1742-6596/1085/3/032052
http://cds.cern.ch/record/2290023
Description
Summary: The ATLAS Trigger and Data Acquisition (TDAQ) is a large, distributed system composed of several thousand interconnected computers and tens of thousands of software processes (applications). Applications produce a large volume of operational messages, on the order of $10^{4}$ messages per second, which need to be reliably stored and delivered to TDAQ operators in a quasi-real-time manner, and also be available for post-mortem analysis by experts. We have selected SPLUNK, a commercial product by Splunk Inc., as an all-in-one solution for storing different types of operational data in an indexed database, and as a web-based framework for searching and presenting the indexed data and for rapid development of user-oriented dashboards accessible in a web browser. The paper describes the capabilities of the Splunk framework, use cases, and the applications and web dashboards developed to facilitate the browsing and searching of TDAQ operational data by TDAQ operators and experts.