
Event-driven RDMA network communication in the ATLAS DAQ system with NetIO

NetIO is a network communication library that enables distributed applications to exchange messages using high-level communication patterns such as publish/subscribe. NetIO is based on libfabric and supports various types of...


Bibliographic Details
Main author: Schumacher, Jorn
Language: eng
Published: 2019
Subjects: Particle Physics - Experiment
Online access: http://cds.cern.ch/record/2701666
_version_ 1780964525036535808
author Schumacher, Jorn
author_facet Schumacher, Jorn
author_sort Schumacher, Jorn
collection CERN
description NetIO is a network communication library that enables distributed applications to exchange messages using high-level communication patterns such as publish/subscribe. NetIO is based on libfabric and supports various types of RDMA networks, for example InfiniBand, RoCE, or Omni-Path. NetIO is currently used in the data acquisition chain of the ATLAS experiment. Major parts of NetIO were recently rewritten using a novel, event-driven approach. All actions are processed asynchronously by a single-threaded central event loop, which is backed by the Linux epoll facility. The event-driven design implies that software written with NetIO uses callbacks to react to events. The motivation for the architectural changes was to improve processing efficiency. Initial benchmarks show that the updated NetIO implementation yields the same or higher throughput while reducing CPU utilization by an order of magnitude. This efficiency gain is largely due to significantly reduced thread synchronization, which became unnecessary in the event-driven approach. The paper will show that this architecture is well suited to the IO-heavy workloads typically found in DAQ systems of high-energy physics experiments. The event-driven architecture will be explained in detail and compared with the original NetIO. The challenges of writing event-driven code are identified. A performance study of the event-driven NetIO, in comparison with the original implementation and with other RDMA networking solutions, will be given.
id cern-2701666
institution European Organization for Nuclear Research
language eng
publishDate 2019
record_format invenio
spelling cern-27016662019-11-15T22:12:53Zhttp://cds.cern.ch/record/2701666engSchumacher, JornEvent-driven RDMA network communication in the ATLAS DAQ system with NetIOParticle Physics - ExperimentEvent-driven RDMA network communication in the ATLAS DAQ system with NetIO NetIO is a network communication library that enables distributed applications to exchange messages using high-level communication patterns such as publish/subscribe. NetIO is based on libfabric and supports various types of RDMA networks, for example Infiniband, RoCE, or OmniPath. NetIO is currently being used in the data acquisition chain of the ATLAS experiment Major parts of NetIO were recently rewritten using a novel, event-driven approach. All actions are processed asynchronously by a single-threaded central event loop. The event loop is backed by the Linux epoll system. The event-driven design implies that software written with NetIO uses callbacks to react to events. The motivation for the architectural modifications to NetIO was to improve processing efficiency. Initial benchmarks show that the updated NetIO implementation yields the same or higher throughput, while the CPU resource utilization is reduced by an order of magnitude. The cause for this efficiency gain is largely due to significantly reduced thread synchronization, that became obsolete in the event driven approach. The paper will show this architecture is very suitable for IO-heavy workloads that are typically found in DAQ systems of High-Energy Physics experiments. The event-driven architecture will be explained in detail and compared with the original NetIO. The challenges of writing event-driven code are identified. A performance study of the event-driven NetIO in comparison with the original implementation as well as other RDMA networking solutions will be given.ATL-DAQ-SLIDE-2019-848oai:cds.cern.ch:27016662019-11-15
spellingShingle Particle Physics - Experiment
Schumacher, Jorn
Event-driven RDMA network communication in the ATLAS DAQ system with NetIO
title Event-driven RDMA network communication in the ATLAS DAQ system with NetIO
title_full Event-driven RDMA network communication in the ATLAS DAQ system with NetIO
title_fullStr Event-driven RDMA network communication in the ATLAS DAQ system with NetIO
title_full_unstemmed Event-driven RDMA network communication in the ATLAS DAQ system with NetIO
title_short Event-driven RDMA network communication in the ATLAS DAQ system with NetIO
title_sort event-driven rdma network communication in the atlas daq system with netio
topic Particle Physics - Experiment
url http://cds.cern.ch/record/2701666
work_keys_str_mv AT schumacherjorn eventdrivenrdmanetworkcommunicationintheatlasdaqsystemwithnetio
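
The abstract in this record describes a single-threaded central event loop, backed by Linux epoll, in which application code reacts to events through callbacks. The sketch below illustrates that general pattern only. It is not NetIO's API: the names EventLoop, register, and on_message are hypothetical, and it uses an ordinary local socket pair rather than libfabric or an RDMA network. Python's selectors.DefaultSelector is used because it selects epoll on Linux and falls back to another mechanism elsewhere.

```python
import selectors
import socket


class EventLoop:
    """Hypothetical single-threaded loop that dispatches events to callbacks."""

    def __init__(self):
        # DefaultSelector uses the most efficient mechanism available;
        # on Linux that is epoll.
        self.selector = selectors.DefaultSelector()
        self.running = False

    def register(self, sock, callback):
        # The callback is stored as the selector key's data and is
        # invoked whenever the socket becomes readable.
        sock.setblocking(False)
        self.selector.register(sock, selectors.EVENT_READ, callback)

    def unregister(self, sock):
        self.selector.unregister(sock)

    def stop(self):
        self.running = False

    def run(self):
        # One thread handles every event, so callbacks need no locks
        # to protect state shared among themselves.
        self.running = True
        while self.running and self.selector.get_map():
            for key, _mask in self.selector.select(timeout=1.0):
                key.data(key.fileobj)


def demo():
    loop = EventLoop()
    received = []
    sender, receiver = socket.socketpair()

    def on_message(sock):
        # React to the "data available" event, then end the loop.
        received.append(sock.recv(4096))
        loop.stop()

    loop.register(receiver, on_message)
    sender.sendall(b"hello")
    loop.run()

    loop.unregister(receiver)
    sender.close()
    receiver.close()
    loop.selector.close()
    return received
```

Calling demo() sends one message through the socket pair and returns the list of messages collected by the callback. Because all callbacks run on the loop's thread, they can share state without locks, which is the kind of reduced thread synchronization the abstract credits for the lower CPU utilization.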