
Shared-Memory Parallel Probabilistic Graphical Modeling Optimization: Comparison of Threads, OpenMP, and Data-Parallel Primitives

Bibliographic Details
Main Authors: Perciano, Talita; Heinemann, Colleen; Camp, David; Lessley, Brenton; Bethel, E. Wes
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7295342/
http://dx.doi.org/10.1007/978-3-030-50743-5_7
collection PubMed
description This work examines the performance characteristics of multiple shared-memory implementations of a probabilistic graphical modeling (PGM) optimization code, which forms the basis for an advanced, state-of-the-art image segmentation method. The work is motivated by the need to accelerate scientific image analysis pipelines used in experimental science, such as at X-ray light sources, and by the need for platform-portable codes that perform well across many different computational architectures. The primary focus of this work, and its main contribution, is an in-depth study of the shared-memory parallel performance of different implementations, including those using alternative parallelization approaches such as C11-threads, OpenMP, and data-parallel primitives (DPPs). Our results show that, for this complex, data-intensive algorithm, the DPP implementation exhibits better runtime performance but less favorable scaling characteristics than the C11-threads and OpenMP counterparts. Based on a set of experiments that collect hardware performance counters on multiple platforms, the runtime performance difference appears to stem primarily from algorithmic efficiency gains: reformulating the solution from its traditional C11-threads and OpenMP expression into data-parallel primitives results in significantly fewer instructions being executed. This study is the first of its type to use hardware-counter-based performance analysis to compare methods built on VTK-m data-parallel primitives with those built on more traditional OpenMP- or threads-based parallelism. It is timely, as there is increasing awareness of the need for platform portability in light of increasing node-level parallelism and device heterogeneity.
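The following is a minimal, illustrative sketch of the kind of reformulation the abstract describes: the same per-element update written once as an explicit loop-parallel (OpenMP) kernel and once as a single data-parallel "map" primitive. The kernel update() is a hypothetical stand-in for the paper's per-pixel PGM/MRF computation, and C++17's std::transform with a parallel execution policy is used here only as a generic DPP analogue; the implementation studied in the paper is built on VTK-m's data-parallel primitives, not the standard library.

    // Illustrative sketch only -- not the paper's actual code.
    // Compile with: g++ -std=c++17 -O2 -fopenmp sketch.cpp -ltbb
    #include <algorithm>
    #include <cmath>
    #include <execution>
    #include <vector>

    // Hypothetical per-element kernel standing in for a per-pixel PGM/MRF update.
    static double update(double x) { return std::exp(-x * x); }

    // Loop-parallel expression: the iteration space is divided explicitly among threads.
    void run_openmp(const std::vector<double>& in, std::vector<double>& out) {
        #pragma omp parallel for
        for (long i = 0; i < static_cast<long>(in.size()); ++i)
            out[i] = update(in[i]);
    }

    // Data-parallel-primitive expression: the same computation stated as one "map";
    // the runtime, not the programmer, decides how the work is distributed.
    void run_dpp(const std::vector<double>& in, std::vector<double>& out) {
        std::transform(std::execution::par_unseq,
                       in.begin(), in.end(), out.begin(), update);
    }

Both routines compute the same result; the paper's finding is that, carried through the full optimization algorithm, the DPP formulation executes significantly fewer instructions, which accounts for its runtime advantage.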
id pubmed-7295342
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-7295342 2020-06-16 High Performance Computing Article 2020-05-22 /pmc/articles/PMC7295342/ http://dx.doi.org/10.1007/978-3-030-50743-5_7 Text en © Springer Nature Switzerland AG 2020. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
topic Article