DGEMM Using Tensor Cores, and Its Accurate and Reproducible Versions
This paper proposes a method for implementing dense matrix multiplication in FP64 (DGEMM) and FP32 (SGEMM) using Tensor Cores on NVIDIA’s graphics processing units (GPUs). Tensor Cores are special processing units that perform [Formula: see text] matrix multiplications on FP16 inputs with FP32 preci...
Main authors: | Mukunoki, Daichi; Ozaki, Katsuhisa; Ogita, Takeshi; Imamura, Toshiyuki |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7295351/ http://dx.doi.org/10.1007/978-3-030-50743-5_12 |
Similar items
- Inter-centre reproducibility of cardiac diffusion tensor measures
  by: Tunnicliffe, Elizabeth M, et al.
  Published: (2014)
- Numerical behavior of NVIDIA tensor cores
  by: Fasi, Massimiliano, et al.
  Published: (2021)
- Accurate and reproducible diagnosis of peanut allergy using epitope mapping
  by: Suárez‐Fariñas, Mayte, et al.
  Published: (2021)
- Reproducibility of in-vivo diffusion tensor cardiovascular magnetic resonance in hypertrophic cardiomyopathy
  by: McGill, Laura-Ann, et al.
  Published: (2012)
- Reproducibility of the Structural Brain Connectome Derived from Diffusion Tensor Imaging
  by: Bonilha, Leonardo, et al.
  Published: (2015)