Optimizing a High Energy Physics (HEP) Toolkit on Heterogeneous Architectures
A desired trend within high energy physics is to increase particle accelerator luminosities, leading to the production of more collision data and higher probabilities of finding interesting physics results. A central data analysis technique used to determine whether results are interesting or not is the maximum likelihood method, and the corresponding evaluation of the negative log-likelihood, which can be computationally expensive. As the amount of data grows, it is important to benefit from the parallelism in modern computers. This, in essence, means exploiting vector registers and all available cores on CPUs, as well as utilizing co-processors such as GPUs. This thesis describes the work done to optimize and parallelize a prototype of a central data analysis tool within the high energy physics community. The work consists of optimizations for multicore processors and GPUs, as well as a mechanism to balance the load between CPUs and GPUs with the aim of fully exploiting the power of modern commodity computers. We explore the OpenCL standard thoroughly and give an overview of its limitations when used in a large real-world software package. We reach a single-core speedup of ∼ 7.8x compared to the original implementation of the toolkit for the physical model we use throughout this thesis. On top of that follows an increase of ∼ 3.6x with 4 threads on a commodity Intel processor, as well as almost perfect scalability on NUMA systems when thread affinity is applied. GPUs give varying speedups depending on the complexity of the physics model used. With our model, price-comparable GPUs give a speedup of ∼ 2.5x compared to a modern Intel CPU utilizing 8 SMT threads. The balancing mechanism is based on real timings of each device and works optimally for large workloads when the API calls to the OpenCL implementation impose a small overhead and when computation timings are accurate.

Main author: | Lindal, Yngve Sneen |
---|---|
Language: | eng |
Published: | Norwegian U. Sci. Tech., 2011 |
Subjects: | Computing and Computers |
Report number: | CERN-THESIS-2011-153 |
Institution: | European Organization for Nuclear Research (CERN) |
Online access: | http://cds.cern.ch/record/1397891 |
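For context, the negative log-likelihood mentioned in the abstract has, in its standard unbinned form, the following shape (a generic formula; the specific physics model used in the thesis is not reproduced here):

$$
-\ln L(\theta) = -\sum_{i=1}^{N} \ln f(x_i \mid \theta)
$$

where $f$ is the normalized probability density of the model with parameters $\theta$ and $x_1, \dots, x_N$ are the observed events. Each term of the sum depends on only one event, which is why the evaluation lends itself to vectorization, multithreading, and GPU offloading as described in the abstract.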
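The load-balancing mechanism is described only at a high level in the abstract: work is divided according to real timings of each device. A minimal sketch of that idea, splitting a batch of events between a CPU and a GPU in proportion to their most recently measured throughput, could look as follows; all names and numbers here are illustrative assumptions and do not reproduce the toolkit's actual implementation:

```cpp
// Hypothetical sketch of timing-based CPU/GPU load balancing: split the next
// batch of likelihood evaluations in proportion to each device's measured
// throughput. Illustrative only; not the toolkit's actual code.
#include <chrono>
#include <cstddef>
#include <iostream>
#include <vector>

struct Device {
    const char* name;
    double throughput;  // events per second, updated from real timings
};

// Split `total` events across devices in proportion to their current throughput.
std::vector<std::size_t> splitWork(const std::vector<Device>& devices, std::size_t total) {
    double sum = 0.0;
    for (const auto& d : devices) sum += d.throughput;

    std::vector<std::size_t> share;
    std::size_t assigned = 0;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        std::size_t n = (i + 1 == devices.size())
            ? total - assigned  // last device absorbs the rounding remainder
            : static_cast<std::size_t>(total * devices[i].throughput / sum);
        share.push_back(n);
        assigned += n;
    }
    return share;
}

int main() {
    // Initial throughputs would come from a calibration run; these are made up.
    std::vector<Device> devices = {{"CPU", 1.0e6}, {"GPU", 2.5e6}};
    const std::size_t totalEvents = 10'000'000;

    auto share = splitWork(devices, totalEvents);
    for (std::size_t i = 0; i < devices.size(); ++i) {
        auto t0 = std::chrono::steady_clock::now();
        // ... in the real tool, evaluate share[i] events of the negative
        // log-likelihood on this device (OpenCL kernel or vectorized CPU loop);
        // here the timed section is empty, so the measurement is meaningless.
        auto t1 = std::chrono::steady_clock::now();
        double seconds = std::chrono::duration<double>(t1 - t0).count();
        if (seconds > 0.0) devices[i].throughput = share[i] / seconds;  // feed timing back for the next batch
        std::cout << devices[i].name << ": " << share[i] << " events\n";
    }
}
```

A proportional split of this kind tends toward both devices finishing at the same time as long as per-device throughput stays roughly constant between batches, which matches the abstract's caveat that the mechanism works best when computation timings are accurate and the OpenCL API overhead is small.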