
Data-flow Performance Optimisation on Unreliable Networks: the ATLAS Data-Acquisition Case

Abstract The ATLAS detector at CERN records proton-proton collisions delivered by the Large Hadron Collider (LHC). The ATLAS Trigger and Data-Acquisition (TDAQ) system identifies, selects, and stores interesting collision data. These are received from the detector readout electronics at an average r...


Bibliographic Details
Main author:	Colombo, T
Language:	eng
Published:	2014
Subjects:	Particle Physics - Experiment
Online access:	https://dx.doi.org/10.1088/1742-6596/608/1/012005 http://cds.cern.ch/record/1950369

_version_	1780944209648287744
author	Colombo, T
author_facet	Colombo, T
author_sort	Colombo, T
collection	CERN
description	The ATLAS detector at CERN records proton-proton collisions delivered by the Large Hadron Collider (LHC). The ATLAS Trigger and Data-Acquisition (TDAQ) system identifies, selects, and stores interesting collision data. These are received from the detector readout electronics at an average rate of 100 kHz. The typical event data size is 1 to 2 MB. Overall, the ATLAS TDAQ can be seen as a distributed software system executed on a farm of roughly 2000 commodity PCs. The worker nodes are interconnected by an Ethernet network that at the restart of the LHC in 2015 is expected to experience a sustained throughput of several 10 GB/s. A particular type of challenge posed by this system, and by DAQ systems in general, is the inherently bursty nature of the data traffic from the readout buffers to the worker nodes. This can cause instantaneous network congestion and therefore performance degradation. The effect is particularly pronounced for unreliable network interconnections, such as Ethernet. In this paper we report on the design of the data-flow software for the 2015-2018 data-taking period of the ATLAS experiment. This software will be responsible for transporting the data across the distributed Data-Acquisition system. We will focus on the strategies employed to manage the network congestion and therefore minimize the data-collection latency and maximize the system performance. We will discuss the results of systematic measurements performed on different types of networking hardware. These results highlight the causes of network congestion and the effects on the overall system performance.
id	cern-1950369
institution	Organización Europea para la Investigación Nuclear
language	eng
publishDate	2014
record_format	invenio
spelling	cern-1950369 2019-09-30T06:29:59Z doi:10.1088/1742-6596/608/1/012005 http://cds.cern.ch/record/1950369 eng Colombo, T Data-flow Performance Optimisation on Unreliable Networks: the ATLAS Data-Acquisition Case Particle Physics - Experiment The ATLAS detector at CERN records proton-proton collisions delivered by the Large Hadron Collider (LHC). The ATLAS Trigger and Data-Acquisition (TDAQ) system identifies, selects, and stores interesting collision data. These are received from the detector readout electronics at an average rate of 100 kHz. The typical event data size is 1 to 2 MB. Overall, the ATLAS TDAQ can be seen as a distributed software system executed on a farm of roughly 2000 commodity PCs. The worker nodes are interconnected by an Ethernet network that at the restart of the LHC in 2015 is expected to experience a sustained throughput of several 10 GB/s. A particular type of challenge posed by this system, and by DAQ systems in general, is the inherently bursty nature of the data traffic from the readout buffers to the worker nodes. This can cause instantaneous network congestion and therefore performance degradation. The effect is particularly pronounced for unreliable network interconnections, such as Ethernet. In this paper we report on the design of the data-flow software for the 2015-2018 data-taking period of the ATLAS experiment. This software will be responsible for transporting the data across the distributed Data-Acquisition system. We will focus on the strategies employed to manage the network congestion and therefore minimize the data-collection latency and maximize the system performance. We will discuss the results of systematic measurements performed on different types of networking hardware. These results highlight the causes of network congestion and the effects on the overall system performance. ATL-DAQ-PROC-2014-029 oai:cds.cern.ch:1950369 2014-09-25
spellingShingle	Particle Physics - Experiment Colombo, T Data-flow Performance Optimisation on Unreliable Networks: the ATLAS Data-Acquisition Case
title	Data-flow Performance Optimisation on Unreliable Networks: the ATLAS Data-Acquisition Case
title_full	Data-flow Performance Optimisation on Unreliable Networks: the ATLAS Data-Acquisition Case
title_fullStr	Data-flow Performance Optimisation on Unreliable Networks: the ATLAS Data-Acquisition Case
title_full_unstemmed	Data-flow Performance Optimisation on Unreliable Networks: the ATLAS Data-Acquisition Case
title_short	Data-flow Performance Optimisation on Unreliable Networks: the ATLAS Data-Acquisition Case
title_sort	data-flow performance optimisation on unreliable networks: the atlas data-acquisition case
topic	Particle Physics - Experiment
url	https://dx.doi.org/10.1088/1742-6596/608/1/012005 http://cds.cern.ch/record/1950369
work_keys_str_mv	AT colombot dataflowperformanceoptimisationonunreliablenetworkstheatlasdataacquisitioncase


Similar Items