
Computing the sparse matrix vector product using block-based kernels without zero padding on processors with AVX-512 instructions

The sparse matrix-vector product (SpMV) is a fundamental operation in many scientific applications from various fields. The High Performance Computing (HPC) community has therefore continuously invested a lot of effort to provide an efficient SpMV kernel on modern CPU architectures. Although it has...


Bibliographic Details
Main Authors:	Bramas, Bérenger; Kus, Pavel
Format:	Online Article Text
Language:	English
Published:	PeerJ Inc. 2018
Subjects:	Distributed and Parallel Computing
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7924463/ https://www.ncbi.nlm.nih.gov/pubmed/33816805 http://dx.doi.org/10.7717/peerj-cs.151

_version_	1783659094982262784
author	Bramas, Bérenger Kus, Pavel
collection	PubMed
description	The sparse matrix-vector product (SpMV) is a fundamental operation in many scientific applications from various fields. The High Performance Computing (HPC) community has therefore continuously invested a lot of effort to provide an efficient SpMV kernel on modern CPU architectures. Although it has been shown that block-based kernels help to achieve high performance, they are difficult to use in practice because of the zero padding they require. In the current paper, we propose new kernels using the AVX-512 instruction set, which makes it possible to use a blocking scheme without any zero padding in the matrix memory storage. We describe mask-based sparse matrix formats and their corresponding SpMV kernels highly optimized in assembly language. Considering that the optimal blocking size depends on the matrix, we also provide a method to predict the best kernel to be used utilizing a simple interpolation of results from previous executions. We compare the performance of our approach to that of the Intel MKL CSR kernel and the CSR5 open-source package on a set of standard benchmark matrices. We show that we can achieve significant improvements in many cases, both for sequential and for parallel executions. Finally, we provide the corresponding code in an open source library, called SPC5.
format	Online Article Text
id	pubmed-7924463
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	PeerJ Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-7924463 2021-04-02 PeerJ Comput Sci, PeerJ Inc., 2018-04-30 /pmc/articles/PMC7924463/ /pubmed/33816805 http://dx.doi.org/10.7717/peerj-cs.151 Text en ©2018 Bramas and Kus http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title	Computing the sparse matrix vector product using block-based kernels without zero padding on processors with AVX-512 instructions
topic	Distributed and Parallel Computing
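
The description mentions mask-based block formats that avoid zero padding. The following is a minimal sketch of that idea in plain Python, not the SPC5 API and not the authors' assembly kernels. All function and variable names are hypothetical. The 1 x 8 block shape is an assumption, chosen because eight doubles fill one 512-bit register, and the inner bit loop is a scalar stand-in for what an AVX-512 expand-load would do in one instruction. Column indices are assumed to be sorted within each CSR row.

```python
BLOCK_WIDTH = 8  # assumed block width: one 8-bit mask per block


def csr_to_masked_blocks(row_ptr, col_idx, values):
    """Group the nonzeros of each CSR row into 1 x BLOCK_WIDTH blocks.

    Each block stores only its first column and a bit mask marking which
    of its BLOCK_WIDTH positions hold a nonzero. The values array is
    reused unchanged, so no zero is ever stored.
    """
    block_row_ptr = [0]
    block_cols = []
    block_masks = []
    for row in range(len(row_ptr) - 1):
        k, end = row_ptr[row], row_ptr[row + 1]
        while k < end:
            first_col = col_idx[k]
            mask = 0
            # Absorb every nonzero in [first_col, first_col + BLOCK_WIDTH).
            while k < end and col_idx[k] < first_col + BLOCK_WIDTH:
                mask |= 1 << (col_idx[k] - first_col)
                k += 1
            block_cols.append(first_col)
            block_masks.append(mask)
        block_row_ptr.append(len(block_cols))
    return block_row_ptr, block_cols, block_masks, values


def spmv_masked_blocks(block_row_ptr, block_cols, block_masks, values, x):
    """Compute y = A @ x from the mask-based block representation."""
    y = [0.0] * (len(block_row_ptr) - 1)
    v = 0  # running index into the packed values array
    for row in range(len(y)):
        acc = 0.0
        for b in range(block_row_ptr[row], block_row_ptr[row + 1]):
            col, mask = block_cols[b], block_masks[b]
            for bit in range(BLOCK_WIDTH):
                if (mask >> bit) & 1:
                    # Only set bits consume a stored value.
                    acc += values[v] * x[col + bit]
                    v += 1
        y[row] = acc
    return y


# A 3 x 10 example matrix in CSR form.
row_ptr = [0, 3, 4, 7]
col_idx = [0, 1, 9, 4, 2, 3, 8]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
x = [float(i) for i in range(1, 11)]

blocks = csr_to_masked_blocks(row_ptr, col_idx, values)
y = spmv_masked_blocks(*blocks, x)
```

In this sketch the seven nonzeros are covered by four blocks, so only four column indices and four masks are stored next to the unchanged values array. A padded block format would instead store eight values per block, most of them zeros.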


Similar Items