Improvements in the LHCb DAQ

The LHCb Data Acquisition system consists of about 300 FPGA-powered data sources connected to a large farm of about 1500 x86 servers. The connection is made by an Ethernet Local Area Network with more than 3000 ports. The very simple, connectionless push protocol for event-building employed by LHCb...

Bibliographic Details
Main authors: Campora, Daniel; Neufeld, Niko; Schwemmer, Rainer
Language: eng
Published: 2014
Subjects: Detectors and Experimental Techniques; Computing and Computers
Online access: https://dx.doi.org/10.1109/RTC.2014.7097512
http://cds.cern.ch/record/2198346
_version_ 1780951250235293696
author Campora, Daniel
Neufeld, Niko
Schwemmer, Rainer
collection CERN
description The LHCb Data Acquisition system consists of about 300 FPGA-powered data sources connected to a large farm of about 1500 x86 servers. The connection is made by an Ethernet Local Area Network with more than 3000 ports. The very simple, connectionless push protocol for event-building employed by LHCb relies critically on extremely low loss rates in the network. Since the last presentation of this system at the 2010 Real Time conference, the redundancy of the system has been significantly improved and it has also grown in size. The redundancy has increased the complexity of the network, but we managed to hide this from the event-builder “applications” on the FPGA and the individual CPU nodes. This paper describes the setup and its challenges. Ageing network hardware cannot always be replaced identically, because maintenance of old network devices becomes very expensive. We have begun a campaign to identify replacement devices and will describe our procedure and measurement results. One feature that distinguishes the LHCb data acquisition system from other LHC DAQs is the use of the Timing and Fast Control (TFC) system, LHCb's variant of the LHC-wide Timing and Trigger Control (TTC), for event management. The TFC is a hard real-time system, which needs to cooperate with a variable-latency network for the purpose of event management. Together with our unreliable event-building protocol, this makes the overall system sensitive to latency distributions on the network, which can lead to occasional problems in the network. To better understand these effects, we have measured the timing structure on the network in real time using independent FPGA-based network probes. These rather challenging measurements on a live 500 Gbit/s network will be used to improve the system for the next LHC run. Another change is the addition of disks to the event-receiving nodes, which used to do purely transient processing. This has increased the “elasticity” of the system, at the expense of increased operational complexity. We will discuss performance and reliability issues.
id oai-inspirehep.net-1367440
institution European Organization for Nuclear Research
language eng
publishDate 2014
record_format invenio
title Improvements in the LHCb DAQ
topic Detectors and Experimental Techniques
Computing and Computers
url https://dx.doi.org/10.1109/RTC.2014.7097512
http://cds.cern.ch/record/2198346