Ethernet for High-Throughput Computing at CERN

Bibliographic Details
Main authors: Krawczyk, Rafal; Colombo, Tommaso; Neufeld, Niko; Pisani, Flavio; Valat, Sebastien
Language: eng
Published: 2022
Subjects:
Online access: https://dx.doi.org/10.1109/tpds.2022.3163472
http://cds.cern.ch/record/2852832
Description
Summary: When a cluster most needs high throughput and fabric utilization close to link capacity, Ethernet is a potential candidate that rivals traditional HPC interconnects. Distributed real-time data acquisition at particle physics experiments presents an interesting use case. This article evaluates possible Ethernet-based solutions for aggregating data from hundreds of data sources at a throughput of dozens of Tb/s. This leads us to many-to-one data exchanges, where we strive for a cost-optimized setup that sustains more than 80% of the theoretical link load. We investigate possible Ethernet-based traffic patterns for handling data acquisition on large multi-source apparatuses. The large-scale network allows different numbers of producers and receivers and different link speeds. Performance tests were conducted using customized benchmarks and evaluation test benches. The article presents the tested scenarios and the problems encountered in practice. We describe how our findings influenced the design of a large production system at CERN. We also present general conclusions relevant to a broader range of Ethernet applications in HPC.