Vectorization of CMSSW offline software

Vectorization in CMSSW applications. The CMS experiment has been utilizing vectorization, or SIMD, in parts of its data processing applications for over a decade. On x86 platforms the vectorization level is still SSE3. In the past, attempts to use wider vector instruction sets such as AVX or AVX-512 ha...

Bibliographic Details
Main author: Gartung, Patrick Elmo
Language: eng
Published: 2023
Subjects: Detectors and Experimental Techniques
Online access: http://cds.cern.ch/record/2872256
_version_ 1780978594516828160
author Gartung, Patrick Elmo
author_facet Gartung, Patrick Elmo
author_sort Gartung, Patrick Elmo
collection CERN
description Vectorization in CMSSW applications. The CMS experiment has been utilizing vectorization, or SIMD, in parts of its data processing applications for over a decade. On x86 platforms the vectorization level is still SSE3. In the past, attempts to use wider vector instruction sets such as AVX or AVX-512 have, in practice, not improved the overall event processing throughput, because the CPUs scale down their frequency when processing AVX instructions. In addition, a notable part of the global pool of CMS resources has been old systems either not supporting AVX, or where the CPU frequency downscaling impacts all cores of the CPU. CMS has nevertheless continued to vectorize more of its application code, and in this work we review profiling methods we have found effective for identifying pieces of code that would benefit from vectorization, and techniques to transform that code such that the GCC compiler is able to auto-vectorize it. The build system used for CMSSW, Scram, has also been enhanced to build code for multiple CPU microarchitectures, such that the shared libraries of the desired microarchitecture level can be loaded based on the CPU of the system. This multi-microarchitecture setup is invisible to the workflow management system, which makes its deployment straightforward. We describe in detail how this multi-microarchitecture build is set up, and measure the impact of using wider vector units than SSE3 on the event processing throughput of CMS applications such as simulation and reconstruction on recent x86 CPUs.
id cern-2872256
institution European Organization for Nuclear Research
language eng
publishDate 2023
record_format invenio
spelling cern-2872256 2023-09-25T18:53:32Z http://cds.cern.ch/record/2872256 eng Gartung, Patrick Elmo Vectorization of CMSSW offline software Detectors and Experimental Techniques Vectorization in CMSSW applications. The CMS experiment has been utilizing vectorization, or SIMD, in parts of its data processing applications for over a decade. On x86 platforms the vectorization level is still SSE3. In the past, attempts to use wider vector instruction sets such as AVX or AVX-512 have, in practice, not improved the overall event processing throughput, because the CPUs scale down their frequency when processing AVX instructions. In addition, a notable part of the global pool of CMS resources has been old systems either not supporting AVX, or where the CPU frequency downscaling impacts all cores of the CPU. CMS has nevertheless continued to vectorize more of its application code, and in this work we review profiling methods we have found effective for identifying pieces of code that would benefit from vectorization, and techniques to transform that code such that the GCC compiler is able to auto-vectorize it. The build system used for CMSSW, Scram, has also been enhanced to build code for multiple CPU microarchitectures, such that the shared libraries of the desired microarchitecture level can be loaded based on the CPU of the system. This multi-microarchitecture setup is invisible to the workflow management system, which makes its deployment straightforward. We describe in detail how this multi-microarchitecture build is set up, and measure the impact of using wider vector units than SSE3 on the event processing throughput of CMS applications such as simulation and reconstruction on recent x86 CPUs. CMS-CR-2023-115 oai:cds.cern.ch:2872256 2023-08-15
spellingShingle Detectors and Experimental Techniques
Gartung, Patrick Elmo
Vectorization of CMSSW offline software
title Vectorization of CMSSW offline software
title_full Vectorization of CMSSW offline software
title_fullStr Vectorization of CMSSW offline software
title_full_unstemmed Vectorization of CMSSW offline software
title_short Vectorization of CMSSW offline software
title_sort vectorization of cmssw offline software
topic Detectors and Experimental Techniques
url http://cds.cern.ch/record/2872256
work_keys_str_mv AT gartungpatrickelmo vectorizationofcmsswofflinesoftware