Small SIMD matrices for CERN high throughput computing
System tracking is an old problem and has been heavily optimized throughout the past. However, in High Energy Physics, many small systems are tracked in real-time using Kalman filtering and no implementation satisfying those constraints currently exists. In this paper, we present a code generator us...
Main authors: | Lemaitre, Florian; Couturier, Benjamin; Lacassagne, Lionel |
---|---|
Language: | eng |
Published: | 2018 |
Subjects: | Computing and Computers |
Online access: | https://dx.doi.org/10.1145/3178433.3178434 http://cds.cern.ch/record/2632924 |
_version_ | 1780959626305470464 |
---|---|
author | Lemaitre, Florian Couturier, Benjamin Lacassagne, Lionel |
author_facet | Lemaitre, Florian Couturier, Benjamin Lacassagne, Lionel |
author_sort | Lemaitre, Florian |
collection | CERN |
description | System tracking is an old problem and has been heavily optimized throughout the past. However, in High Energy Physics, many small systems are tracked in real-time using Kalman filtering and no implementation satisfying those constraints currently exists. In this paper, we present a code generator used to speed up Cholesky Factorization and Kalman Filter for small matrices. The generator is easy to use and produces portable and heavily optimized code. We focus on current SIMD architectures (SSE, AVX, AVX512, Neon, SVE, Altivec and VSX). Our Cholesky factorization outperforms any existing libraries: from x3 to x10 faster than MKL. The Kalman Filter is also faster than existing implementations, and achieves $4 \cdot 10^9$ iter/s on a 2x24C Intel Xeon. |
id | oai-inspirehep.net-1670546 |
institution | European Organization for Nuclear Research |
language | eng |
publishDate | 2018 |
record_format | invenio |
spelling | oai-inspirehep.net-1670546 2022-06-30T22:23:23Z doi:10.1145/3178433.3178434 http://cds.cern.ch/record/2632924 eng Lemaitre, Florian Couturier, Benjamin Lacassagne, Lionel Small SIMD matrices for CERN high throughput computing Computing and Computers System tracking is an old problem and has been heavily optimized throughout the past. However, in High Energy Physics, many small systems are tracked in real-time using Kalman filtering and no implementation satisfying those constraints currently exists. In this paper, we present a code generator used to speed up Cholesky Factorization and Kalman Filter for small matrices. The generator is easy to use and produces portable and heavily optimized code. We focus on current SIMD architectures (SSE, AVX, AVX512, Neon, SVE, Altivec and VSX). Our Cholesky factorization outperforms any existing libraries: from x3 to x10 faster than MKL. The Kalman Filter is also faster than existing implementations, and achieves $4 \cdot 10^9$ iter/s on a 2x24C Intel Xeon. oai:inspirehep.net:1670546 2018 |
spellingShingle | Computing and Computers Lemaitre, Florian Couturier, Benjamin Lacassagne, Lionel Small SIMD matrices for CERN high throughput computing |
title | Small SIMD matrices for CERN high throughput computing |
title_full | Small SIMD matrices for CERN high throughput computing |
title_fullStr | Small SIMD matrices for CERN high throughput computing |
title_full_unstemmed | Small SIMD matrices for CERN high throughput computing |
title_short | Small SIMD matrices for CERN high throughput computing |
title_sort | small simd matrices for cern high throughput computing |
topic | Computing and Computers |
url | https://dx.doi.org/10.1145/3178433.3178434 http://cds.cern.ch/record/2632924 |
work_keys_str_mv | AT lemaitreflorian smallsimdmatricesforcernhighthroughputcomputing AT couturierbenjamin smallsimdmatricesforcernhighthroughputcomputing AT lacassagnelionel smallsimdmatricesforcernhighthroughputcomputing |