Performance of CUDA Unified Memory in CMS Heterogeneous Pixel Reconstruction

The management of separate memory spaces of CPUs and GPUs brings an additional burden to the development of software for GPUs. To help with this, CUDA unified memory provides a single address space that can be accessed from both CPU and GPU. The automatic data transfer mechanism is based on page fau...

Bibliographic Details
Main authors: Kortelainen, Matti J; Kwok, Martin
Language: eng
Published: 2021
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1051/epjconf/202125103035
http://cds.cern.ch/record/2813818
_version_ 1780973424387031040
author Kortelainen, Matti J
Kwok, Martin
author_facet Kortelainen, Matti J
Kwok, Martin
author_sort Kortelainen, Matti J
collection CERN
description The management of separate memory spaces of CPUs and GPUs brings an additional burden to the development of software for GPUs. To help with this, CUDA unified memory provides a single address space that can be accessed from both CPU and GPU. The automatic data transfer mechanism is based on page faults generated by the memory accesses. This mechanism has a performance cost that can be reduced with explicit memory prefetch requests. Various hints on the intended usage of the memory regions can also be given to further improve the performance. The overall effect of unified memory compared to explicit memory management can depend heavily on the application. In this paper we evaluate the performance impact of CUDA unified memory using the heterogeneous pixel reconstruction code from the CMS experiment as a realistic use case of GPU-targeting HEP reconstruction software. We also compare the programming model using CUDA unified memory to the explicit management of separate CPU and GPU memory spaces.
id cern-2813818
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2021
record_format invenio
spelling cern-2813818 2022-11-17T14:30:12Z doi:10.1051/epjconf/202125103035 http://cds.cern.ch/record/2813818 eng Kortelainen, Matti J Kwok, Martin Performance of CUDA Unified Memory in CMS Heterogeneous Pixel Reconstruction Computing and Computers The management of separate memory spaces of CPUs and GPUs brings an additional burden to the development of software for GPUs. To help with this, CUDA unified memory provides a single address space that can be accessed from both CPU and GPU. The automatic data transfer mechanism is based on page faults generated by the memory accesses. This mechanism has a performance cost that can be reduced with explicit memory prefetch requests. Various hints on the intended usage of the memory regions can also be given to further improve the performance. The overall effect of unified memory compared to explicit memory management can depend heavily on the application. In this paper we evaluate the performance impact of CUDA unified memory using the heterogeneous pixel reconstruction code from the CMS experiment as a realistic use case of GPU-targeting HEP reconstruction software. We also compare the programming model using CUDA unified memory to the explicit management of separate CPU and GPU memory spaces. FERMILAB-CONF-21-064-SCD oai:cds.cern.ch:2813818 2021
spellingShingle Computing and Computers
Kortelainen, Matti J
Kwok, Martin
Performance of CUDA Unified Memory in CMS Heterogeneous Pixel Reconstruction
title Performance of CUDA Unified Memory in CMS Heterogeneous Pixel Reconstruction
title_full Performance of CUDA Unified Memory in CMS Heterogeneous Pixel Reconstruction
title_fullStr Performance of CUDA Unified Memory in CMS Heterogeneous Pixel Reconstruction
title_full_unstemmed Performance of CUDA Unified Memory in CMS Heterogeneous Pixel Reconstruction
title_short Performance of CUDA Unified Memory in CMS Heterogeneous Pixel Reconstruction
title_sort performance of cuda unified memory in cms heterogeneous pixel reconstruction
topic Computing and Computers
url https://dx.doi.org/10.1051/epjconf/202125103035
http://cds.cern.ch/record/2813818
work_keys_str_mv AT kortelainenmattij performanceofcudaunifiedmemoryincmsheterogeneouspixelreconstruction
AT kwokmartin performanceofcudaunifiedmemoryincmsheterogeneouspixelreconstruction