Recent updates of the Control and Configuration of the ATLAS Trigger and Data Acquisition System

Bibliographic Details
Main author: Bianchi, R M
Language: eng
Published: 2011
Subjects:
Online access: http://cds.cern.ch/record/1394278
Description
Summary: The ATLAS experiment at the Large Hadron Collider at CERN relies on a complex and highly distributed Trigger and Data Acquisition (TDAQ) system \cite{tdaq:tdr} to gather and select particle collision data at unprecedented energies and rates. The Control and Configuration (CC) system is responsible for all the software required to configure and control ATLAS data taking. This ranges from high-level applications, such as the graphical user interfaces and the desktops used in the ATLAS control room, to low-level packages, such as access, process and resource management. The CC system is currently required to supervise more than 30000 processes running on more than 2000 computers. At this scale, access, process and resource management, distribution of and access to configuration data, run control, diagnostics and especially error recovery become critical to guaranteeing high availability of the TDAQ system and minimizing the dead time of the experiment. It is during data taking that the CC system has shown its strength and maturity: it has scaled well with the ever-increasing number of software processes in the TDAQ system, and it implements several automatic error recovery procedures for complex scenarios. This paper gives an overview of the new functionality and recent upgrades of several CC system components, with special emphasis on speed and reliability improvements and on optimizing the user experience during operations.