Small SIMD matrices for CERN high throughput computing

System tracking is an old problem that has been heavily optimized over the years. However, in High Energy Physics, many small systems are tracked in real time using Kalman filtering, and no implementation satisfying those constraints currently exists. In this paper, we present a code generator us...

Bibliographic Details
Main authors: Lemaitre, Florian; Couturier, Benjamin; Lacassagne, Lionel
Language: English
Published: 2018
Subjects: Computing and Computers
Online access: https://dx.doi.org/10.1145/3178433.3178434
http://cds.cern.ch/record/2632924
author Lemaitre, Florian
Couturier, Benjamin
Lacassagne, Lionel
collection CERN
description System tracking is an old problem that has been heavily optimized over the years. However, in High Energy Physics, many small systems are tracked in real time using Kalman filtering, and no implementation satisfying those constraints currently exists. In this paper, we present a code generator used to speed up Cholesky factorization and Kalman filtering for small matrices. The generator is easy to use and produces portable, heavily optimized code. We focus on current SIMD architectures (SSE, AVX, AVX-512, NEON, SVE, AltiVec and VSX). Our Cholesky factorization outperforms all existing libraries: it is 3x to 10x faster than MKL. The Kalman filter is also faster than existing implementations and achieves $4 \cdot 10^9$ iter/s on a 2x24C Intel Xeon.
id oai-inspirehep.net-1670546
institution European Organization for Nuclear Research (CERN)
language eng
publishDate 2018
record_format invenio
title Small SIMD matrices for CERN high throughput computing
topic Computing and Computers
url https://dx.doi.org/10.1145/3178433.3178434
http://cds.cern.ch/record/2632924