
heFFTe: Highly Efficient FFT for Exascale

Exascale computing aspires to meet the increasing demands from large scientific applications. Software targeting exascale is typically designed for heterogeneous architectures; henceforth, it is not only important to develop well-designed software, but also make it aware of the hardware architecture...


Bibliographic Details
Main Authors: Ayala, Alan, Tomov, Stanimire, Haidar, Azzam, Dongarra, Jack
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302276/
http://dx.doi.org/10.1007/978-3-030-50371-0_19
_version_ 1783547815380647936
author Ayala, Alan
Tomov, Stanimire
Haidar, Azzam
Dongarra, Jack
author_facet Ayala, Alan
Tomov, Stanimire
Haidar, Azzam
Dongarra, Jack
author_sort Ayala, Alan
collection PubMed
description Exascale computing aspires to meet the increasing demands from large scientific applications. Software targeting exascale is typically designed for heterogeneous architectures; henceforth, it is not only important to develop well-designed software, but also make it aware of the hardware architecture and efficiently exploit its power. Currently, several and diverse applications, such as those part of the Exascale Computing Project (ECP) in the United States, rely on efficient computation of the Fast Fourier Transform (FFT). In this context, we present the design and implementation of heFFTe (Highly Efficient FFT for Exascale) library, which targets the upcoming exascale supercomputers. We provide highly (linearly) scalable GPU kernels that achieve more than [Formula: see text] speedup with respect to local kernels from CPU state-of-the-art libraries, and over [Formula: see text] speedup for the whole FFT computation. A communication model for parallel FFTs is also provided to analyze the bottleneck for large-scale problems. We show experiments obtained on Summit supercomputer at Oak Ridge National Laboratory, using up to 24,576 IBM Power9 cores and 6,144 NVIDIA V-100 GPUs.
format Online
Article
Text
id pubmed-7302276
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-73022762020-06-18 heFFTe: Highly Efficient FFT for Exascale Ayala, Alan Tomov, Stanimire Haidar, Azzam Dongarra, Jack Computational Science – ICCS 2020 Article Exascale computing aspires to meet the increasing demands from large scientific applications. Software targeting exascale is typically designed for heterogeneous architectures; henceforth, it is not only important to develop well-designed software, but also make it aware of the hardware architecture and efficiently exploit its power. Currently, several and diverse applications, such as those part of the Exascale Computing Project (ECP) in the United States, rely on efficient computation of the Fast Fourier Transform (FFT). In this context, we present the design and implementation of heFFTe (Highly Efficient FFT for Exascale) library, which targets the upcoming exascale supercomputers. We provide highly (linearly) scalable GPU kernels that achieve more than [Formula: see text] speedup with respect to local kernels from CPU state-of-the-art libraries, and over [Formula: see text] speedup for the whole FFT computation. A communication model for parallel FFTs is also provided to analyze the bottleneck for large-scale problems. We show experiments obtained on Summit supercomputer at Oak Ridge National Laboratory, using up to 24,576 IBM Power9 cores and 6,144 NVIDIA V-100 GPUs. 2020-05-26 /pmc/articles/PMC7302276/ http://dx.doi.org/10.1007/978-3-030-50371-0_19 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Article
Ayala, Alan
Tomov, Stanimire
Haidar, Azzam
Dongarra, Jack
heFFTe: Highly Efficient FFT for Exascale
title heFFTe: Highly Efficient FFT for Exascale
title_full heFFTe: Highly Efficient FFT for Exascale
title_fullStr heFFTe: Highly Efficient FFT for Exascale
title_full_unstemmed heFFTe: Highly Efficient FFT for Exascale
title_short heFFTe: Highly Efficient FFT for Exascale
title_sort heffte: highly efficient fft for exascale
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302276/
http://dx.doi.org/10.1007/978-3-030-50371-0_19
work_keys_str_mv AT ayalaalan hefftehighlyefficientfftforexascale
AT tomovstanimire hefftehighlyefficientfftforexascale
AT haidarazzam hefftehighlyefficientfftforexascale
AT dongarrajack hefftehighlyefficientfftforexascale