Modeling a Large Data Acquisition Network in a Simulation Framework

Bibliographic Details
Main author: Colombo, Tommaso
Language: English
Published: 2015
Subjects:
Online access: http://cds.cern.ch/record/2051495
Description
Summary: The ATLAS detector at CERN records particle collision “events” delivered by the Large Hadron Collider. Its data-acquisition system identifies, selects, and stores interesting events in near real-time, with an aggregate throughput of several tens of GB/s. It is a distributed software system executed on a farm of roughly 2000 commodity worker nodes communicating via TCP/IP on an Ethernet network. Event data fragments are received from the many detector readout channels and are buffered, collected together, analyzed, and either stored permanently or discarded. This system, like data-acquisition systems in general, is sensitive to the latency of the data transfer from the readout buffers to the worker nodes. Challenges affecting this transfer include the many-to-one communication pattern and the inherently bursty nature of the traffic. In this paper we introduce the main performance issues brought about by this workload, focusing in particular on the so-called TCP incast pathology. Since systematic studies of these issues are often impeded by operational constraints related to the mission-critical nature of these systems, we focus instead on the development of a simulation model of the ATLAS data-acquisition system, used as a case study. The simulation is based on the well-established OMNeT++ framework. Its results are compared with existing measurements of the system's behavior. The successful reproduction of the measurements by the simulations validates the modeling approach. We share some of the preliminary findings obtained from the simulation, as an example of the additional possibilities it enables, and outline the planned future investigations.
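
The many-to-one, bursty pattern named in the summary can be illustrated with a toy calculation. The sketch below is not the paper's OMNeT++ model and does not simulate TCP. It is a minimal illustration, assuming a single switch output port with a fixed-size packet buffer that forwards one packet per tick while every sender offers one packet per tick. The function name `incast_drops` and all parameter values are hypothetical.

```python
def incast_drops(senders, fragment_packets, buffer_packets, drain_per_tick=1):
    """Count packets dropped at one switch output port (toy model, not TCP).

    Assumptions (all illustrative): every sender starts its burst at the
    same tick and offers one packet per tick until its fragment is sent;
    the port buffer holds at most `buffer_packets` packets and forwards
    `drain_per_tick` packets to the receiver per tick.
    """
    remaining = [fragment_packets] * senders  # packets each sender still has to send
    queue = 0      # packets currently held in the port buffer
    dropped = 0    # packets that arrived while the buffer was full

    while any(remaining) or queue:
        # Arrival phase: synchronized senders hit the same output port.
        for i in range(senders):
            if remaining[i]:
                remaining[i] -= 1
                if queue < buffer_packets:
                    queue += 1
                else:
                    dropped += 1
        # Departure phase: the single link towards the receiver drains the buffer.
        queue = max(0, queue - drain_per_tick)
    return dropped


if __name__ == "__main__":
    for n in (1, 10):
        print(n, "senders ->", incast_drops(n, fragment_packets=4, buffer_packets=16), "drops")
```

In this toy model a single sender never overflows the buffer, while ten synchronized senders lose packets even though each sends only a small fragment. In a real TCP network, losses of this kind can lead to retransmission timeouts, which is where the latency penalty associated with incast comes from; the sketch stops at counting drops and makes no claim about timing.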