CMSSW Scaling Limits on Many-Core Machines
Today the LHC offline computing relies heavily on CPU resources, despite the interest in compute accelerators, such as GPUs, for the longer term future. The number of cores per CPU socket has continued to increase steadily, reaching the levels of 64 cores (128 threads) with recent AMD EPYC processor...
Main author: | Jones, Christopher Duncan |
---|---|
Language: | eng |
Published: | 2023 |
Subjects: | Detectors and Experimental Techniques |
Online access: | http://cds.cern.ch/record/2872253 |
_version_ | 1780978594087960576 |
---|---|
author | Jones, Christopher Duncan |
author_facet | Jones, Christopher Duncan |
author_sort | Jones, Christopher Duncan |
collection | CERN |
description | Today the LHC offline computing relies heavily on CPU resources, despite the interest in compute accelerators, such as GPUs, for the longer term future. The number of cores per CPU socket has continued to increase steadily, reaching the levels of 64 cores (128 threads) with recent AMD EPYC processors, and 128 cores on Ampere Altra Max ARM processors. Over the course of the past decade, the CMS data processing framework, CMSSW, has been transformed from a single-threaded framework into a highly concurrent one. The first multithreaded version was brought into production by the start of the LHC Run 2 in 2015. Since then, the framework's threading efficiency has gradually been improved by adding more levels of concurrency and reducing the amount of serial code paths. The latest addition was support for concurrent Runs. In this work we review the concurrency model of the CMSSW, and measure its scalability with real CMS applications, such as simulation and reconstruction, on modern many-core machines. We show metrics such as event processing throughput and application memory usage with and without the contribution of I/O, as I/O has been the major scaling limitation for the CMS applications. |
id | cern-2872253 |
institution | Organización Europea para la Investigación Nuclear |
language | eng |
publishDate | 2023 |
record_format | invenio |
spelling | cern-2872253 2023-09-25T18:53:32Z http://cds.cern.ch/record/2872253 eng Jones, Christopher Duncan CMSSW Scaling Limits on Many-Core Machines Detectors and Experimental Techniques Today the LHC offline computing relies heavily on CPU resources, despite the interest in compute accelerators, such as GPUs, for the longer term future. The number of cores per CPU socket has continued to increase steadily, reaching the levels of 64 cores (128 threads) with recent AMD EPYC processors, and 128 cores on Ampere Altra Max ARM processors. Over the course of the past decade, the CMS data processing framework, CMSSW, has been transformed from a single-threaded framework into a highly concurrent one. The first multithreaded version was brought into production by the start of the LHC Run 2 in 2015. Since then, the framework's threading efficiency has gradually been improved by adding more levels of concurrency and reducing the amount of serial code paths. The latest addition was support for concurrent Runs. In this work we review the concurrency model of the CMSSW, and measure its scalability with real CMS applications, such as simulation and reconstruction, on modern many-core machines. We show metrics such as event processing throughput and application memory usage with and without the contribution of I/O, as I/O has been the major scaling limitation for the CMS applications. CMS-CR-2023-116 oai:cds.cern.ch:2872253 2023-08-15 |
spellingShingle | Detectors and Experimental Techniques Jones, Christopher Duncan CMSSW Scaling Limits on Many-Core Machines |
title | CMSSW Scaling Limits on Many-Core Machines |
title_full | CMSSW Scaling Limits on Many-Core Machines |
title_fullStr | CMSSW Scaling Limits on Many-Core Machines |
title_full_unstemmed | CMSSW Scaling Limits on Many-Core Machines |
title_short | CMSSW Scaling Limits on Many-Core Machines |
title_sort | cmssw scaling limits on many-core machines |
topic | Detectors and Experimental Techniques |
url | http://cds.cern.ch/record/2872253 |
work_keys_str_mv | AT joneschristopherduncan cmsswscalinglimitsonmanycoremachines |