Applications of advanced data analysis and expert system technologies in the ATLAS Trigger-DAQ Controls framework

Bibliographic Details
Main authors: Avolio, G, Corso Radu, A, Kazarov, A, Lehmann Miotto, G, Magnoni, L
Language: eng
Published: 2012
Subjects:
Online access: http://cds.cern.ch/record/1458056
Description
Summary: The Trigger and DAQ system of the ATLAS experiment is a very complex distributed computing system, composed of more than 15000 applications running on more than 2000 computers. The TDAQ Controls system has to guarantee the smooth and synchronous operation of all the TDAQ components and has to provide the means to minimize the downtime of the system caused by runtime failures. During data-taking runs, streams of information messages sent or published by running applications are the main sources of knowledge about the correctness of running operations. The huge flow of operational monitoring data produced is constantly monitored by experts in order to detect problems or misbehaviours. Given the scale of the system and the rates of data to be analyzed, the automation of the system functionality in the areas of operational monitoring, system verification, error detection and recovery is a strong requirement. To accomplish its objective, the Controls system includes some high-level components which are based on advanced software technologies, namely the rule-based Expert System and the Complex Event Processing engines. The chosen techniques make it possible to formalize, store and reuse the knowledge of experts and thus to assist the TDAQ shift crew in accomplishing its task.