heFFTe: Highly Efficient FFT for Exascale
Exascale computing aspires to meet the increasing demands of large scientific applications. Software targeting exascale is typically designed for heterogeneous architectures; hence, it is important not only to develop well-designed software, but also to make it aware of the hardware architecture...
Main Authors: | Ayala, Alan; Tomov, Stanimire; Haidar, Azzam; Dongarra, Jack |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302276/ http://dx.doi.org/10.1007/978-3-030-50371-0_19 |
_version_ | 1783547815380647936 |
---|---|
author | Ayala, Alan Tomov, Stanimire Haidar, Azzam Dongarra, Jack |
author_facet | Ayala, Alan Tomov, Stanimire Haidar, Azzam Dongarra, Jack |
author_sort | Ayala, Alan |
collection | PubMed |
description | Exascale computing aspires to meet the increasing demands of large scientific applications. Software targeting exascale is typically designed for heterogeneous architectures; hence, it is important not only to develop well-designed software, but also to make it aware of the hardware architecture so that it efficiently exploits the hardware's power. Currently, many diverse applications, such as those that are part of the Exascale Computing Project (ECP) in the United States, rely on efficient computation of the Fast Fourier Transform (FFT). In this context, we present the design and implementation of the heFFTe (Highly Efficient FFT for Exascale) library, which targets the upcoming exascale supercomputers. We provide highly (linearly) scalable GPU kernels that achieve more than [Formula: see text] speedup with respect to local kernels from state-of-the-art CPU libraries, and over [Formula: see text] speedup for the whole FFT computation. A communication model for parallel FFTs is also provided to analyze the bottleneck for large-scale problems. We show experiments obtained on the Summit supercomputer at Oak Ridge National Laboratory, using up to 24,576 IBM Power9 cores and 6,144 NVIDIA V100 GPUs. |
format | Online Article Text |
id | pubmed-7302276 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-73022762020-06-18 heFFTe: Highly Efficient FFT for Exascale Ayala, Alan Tomov, Stanimire Haidar, Azzam Dongarra, Jack Computational Science – ICCS 2020 Article Exascale computing aspires to meet the increasing demands from large scientific applications. Software targeting exascale is typically designed for heterogeneous architectures; henceforth, it is not only important to develop well-designed software, but also make it aware of the hardware architecture and efficiently exploit its power. Currently, several and diverse applications, such as those part of the Exascale Computing Project (ECP) in the United States, rely on efficient computation of the Fast Fourier Transform (FFT). In this context, we present the design and implementation of heFFTe (Highly Efficient FFT for Exascale) library, which targets the upcoming exascale supercomputers. We provide highly (linearly) scalable GPU kernels that achieve more than [Formula: see text] speedup with respect to local kernels from CPU state-of-the-art libraries, and over [Formula: see text] speedup for the whole FFT computation. A communication model for parallel FFTs is also provided to analyze the bottleneck for large-scale problems. We show experiments obtained on Summit supercomputer at Oak Ridge National Laboratory, using up to 24,576 IBM Power9 cores and 6,144 NVIDIA V-100 GPUs. 2020-05-26 /pmc/articles/PMC7302276/ http://dx.doi.org/10.1007/978-3-030-50371-0_19 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Ayala, Alan Tomov, Stanimire Haidar, Azzam Dongarra, Jack heFFTe: Highly Efficient FFT for Exascale |
title | heFFTe: Highly Efficient FFT for Exascale |
title_full | heFFTe: Highly Efficient FFT for Exascale |
title_fullStr | heFFTe: Highly Efficient FFT for Exascale |
title_full_unstemmed | heFFTe: Highly Efficient FFT for Exascale |
title_short | heFFTe: Highly Efficient FFT for Exascale |
title_sort | heffte: highly efficient fft for exascale |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7302276/ http://dx.doi.org/10.1007/978-3-030-50371-0_19 |
work_keys_str_mv | AT ayalaalan hefftehighlyefficientfftforexascale AT tomovstanimire hefftehighlyefficientfftforexascale AT haidarazzam hefftehighlyefficientfftforexascale AT dongarrajack hefftehighlyefficientfftforexascale |