
The Central Hint and Information Processor system for automation, error detection and recovery in the ATLAS TDAQ Controls framework

The ATLAS experiment at the Large Hadron Collider at CERN relies on a complex and highly distributed Trigger and Data Acquisition (TDAQ) system to gather and select particle collision data obtained at unprecedented energy and rates. The TDAQ Controls system is the component that guarantees the smoot...


Bibliographic Details
Main author: Avolio, Giuseppe
Language: eng
Published: 2019
Subjects:
Online access: http://cds.cern.ch/record/2667392
_version_ 1780962065407541248
author Avolio, Giuseppe
collection CERN
description The ATLAS experiment at the Large Hadron Collider at CERN relies on a complex and highly distributed Trigger and Data Acquisition (TDAQ) system to gather and select particle collision data obtained at unprecedented energy and rates. The TDAQ Controls system is the component that guarantees the smooth and synchronous operations of all the TDAQ components and provides the means to minimize the downtime of the system caused by run-time failures. Given the scale and complexity of the TDAQ system and the rates of data to be analyzed, the automation of the system functionality in the areas of error detection and recovery is a strong requirement. That is why in Run 2 the Central Hint and Information Processor (CHIP) service has been introduced; it can be truly considered the "brain" of the TDAQ Controls system. CHIP is an intelligent system able to supervise the ATLAS data taking, take operational decisions and handle abnormal conditions. It is based on an open-source Complex Event Processing (CEP) engine, ESPER. Currently, CHIP's knowledge base is made up of more than 300 rules organized in about 30 different contexts. This paper will focus on the experience gained with CHIP during the whole LHC Run 2 period. Particular attention will be paid to demonstrate how the use of CHIP for automation and error recovery proved to be a valuable asset in optimizing the data taking efficiency, reducing operational mistakes, efficiently handling complex scenarios and improving the latency to react to abnormal situations. Additionally, the huge benefits brought by the CEP engine in terms of both flexibility and simplification of the knowledge base will be reported.
id cern-2667392
institution European Organization for Nuclear Research
language eng
publishDate 2019
record_format invenio
spelling cern-2667392; 2019-09-30T06:29:59Z; http://cds.cern.ch/record/2667392; eng; Avolio, Giuseppe; Particle Physics - Experiment; ATL-DAQ-SLIDE-2019-088; oai:cds.cern.ch:2667392; 2019-03-18
title The Central Hint and Information Processor system for automation, error detection and recovery in the ATLAS TDAQ Controls framework
topic Particle Physics - Experiment
url http://cds.cern.ch/record/2667392