
High Performance Computing PP-Distance Algorithms to Generate X-ray Spectra from 3D Models

X-ray crystallography is a powerful method that has significantly contributed to our understanding of the biological function of proteins and other molecules. This method relies on the production of crystals that, however, are usually a bottleneck in the process. For some molecules, no crystallizati...


Bibliographic Details
Main authors: González, César, Balocco, Simone, Bosch, Jaume, de Haro, Juan Miguel, Paolini, Maurizio, Filgueras, Antonio, Álvarez, Carlos, Pons, Ramon
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9570381/
https://www.ncbi.nlm.nih.gov/pubmed/36232709
http://dx.doi.org/10.3390/ijms231911408
_version_ 1784810092194955264
author González, César
Balocco, Simone
Bosch, Jaume
de Haro, Juan Miguel
Paolini, Maurizio
Filgueras, Antonio
Álvarez, Carlos
Pons, Ramon
author_facet González, César
Balocco, Simone
Bosch, Jaume
de Haro, Juan Miguel
Paolini, Maurizio
Filgueras, Antonio
Álvarez, Carlos
Pons, Ramon
author_sort González, César
collection PubMed
description X-ray crystallography is a powerful method that has contributed significantly to our understanding of the biological function of proteins and other molecules. The method relies on producing crystals, which is usually the bottleneck of the process. For some molecules, crystallization has not been achieved or insufficient crystals were obtained. Other systems, such as nanoparticles, do not crystallize at all and, because of their dimensions, cannot be treated by the usual crystallographic methods. To address this, the whole pair distribution function has been proposed to bridge the gap between the Bragg and Debye scattering theories. To perform a fit, the spectra of several different constructs, each composed of millions of particles, must be computed using a particle–pair or particle–particle (pp) distance algorithm. Using this computation as a test bench for current field-programmable gate array (FPGA) technology, we evaluate how the parallel computation capability of FPGAs can be exploited to reduce the computation time. We present two solutions to the problem using two state-of-the-art FPGA technologies. In the first, the main C program uses OmpSs (a high-level programming model developed at the Barcelona Supercomputing Center that enables task offload to different high-performance computing devices) for task invocation, and the kernels are built with OpenCL using reduced data sizes to save transmission time. The second approach uses task and data parallelism to operate on data locally and to update data globally in a decoupled task. Benchmarks were run on an Intel D5005 Programmable Acceleration Card, which computed a model of 2 million particles in 81.57 s (24.5 billion atom pairs per second, bapps), and on a ZU102, which took 115.31 s. In our last test, on an up-to-date Alveo U200 board, the computation took 34.68 s (57.67 bapps).
In this study, we analyze the results in terms of the classic metrics of speed-up and efficiency and give hints for future improvements focused on reducing the global job time.
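As a rough illustration of what a pp-distance computation involves, the sketch below bins all unordered pair distances of a small 3D model into a histogram and then evaluates the Debye scattering sum from that histogram. It is a minimal CPU reference written under stated assumptions, not the authors' FPGA kernels: the function names, the bin width, and the choice of identical point scatterers with unit form factor are all hypothetical and do not come from the record.

```python
import math
from itertools import combinations


def pair_distance_histogram(points, bin_width, n_bins):
    """Count unordered particle pairs per distance bin (O(N^2) brute force).

    points: sequence of (x, y, z) tuples. Pairs farther apart than
    bin_width * n_bins are ignored (an assumption of this sketch).
    """
    hist = [0] * n_bins
    for (x1, y1, z1), (x2, y2, z2) in combinations(points, 2):
        r = math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
        b = int(r / bin_width)
        if b < n_bins:
            hist[b] += 1
    return hist


def debye_intensity(hist, bin_width, q, n_particles):
    """Debye sum for identical point scatterers with unit form factor.

    I(q) = N + 2 * sum_k h_k * sin(q r_k) / (q r_k), where r_k is the
    centre of bin k and h_k the number of pairs in that bin.
    """
    total = float(n_particles)
    for k, count in enumerate(hist):
        if count == 0:
            continue
        x = q * (k + 0.5) * bin_width
        total += 2.0 * count * (math.sin(x) / x if x != 0.0 else 1.0)
    return total


if __name__ == "__main__":
    # Four particles on the corners of a unit square (hypothetical model).
    square = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
    h = pair_distance_histogram(square, bin_width=0.25, n_bins=8)
    print(h)
    print(debye_intensity(h, 0.25, q=2.0, n_particles=len(square)))
```

The histogram step is the quadratic part: a model with N particles has N(N-1)/2 unordered pairs, which is why models with millions of particles motivate hardware acceleration. Once the histogram exists, the Debye sum costs only one pass over the bins per value of q.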
format Online
Article
Text
id pubmed-9570381
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9570381 2022-10-17 High Performance Computing PP-Distance Algorithms to Generate X-ray Spectra from 3D Models González, César Balocco, Simone Bosch, Jaume de Haro, Juan Miguel Paolini, Maurizio Filgueras, Antonio Álvarez, Carlos Pons, Ramon Int J Mol Sci Article X-ray crystallography is a powerful method that has contributed significantly to our understanding of the biological function of proteins and other molecules. The method relies on producing crystals, which is usually the bottleneck of the process. For some molecules, crystallization has not been achieved or insufficient crystals were obtained. Other systems, such as nanoparticles, do not crystallize at all and, because of their dimensions, cannot be treated by the usual crystallographic methods. To address this, the whole pair distribution function has been proposed to bridge the gap between the Bragg and Debye scattering theories. To perform a fit, the spectra of several different constructs, each composed of millions of particles, must be computed using a particle–pair or particle–particle (pp) distance algorithm. Using this computation as a test bench for current field-programmable gate array (FPGA) technology, we evaluate how the parallel computation capability of FPGAs can be exploited to reduce the computation time. We present two solutions to the problem using two state-of-the-art FPGA technologies. In the first, the main C program uses OmpSs (a high-level programming model developed at the Barcelona Supercomputing Center that enables task offload to different high-performance computing devices) for task invocation, and the kernels are built with OpenCL using reduced data sizes to save transmission time. The second approach uses task and data parallelism to operate on data locally and to update data globally in a decoupled task.
Benchmarks were run on an Intel D5005 Programmable Acceleration Card, which computed a model of 2 million particles in 81.57 s (24.5 billion atom pairs per second, bapps), and on a ZU102, which took 115.31 s. In our last test, on an up-to-date Alveo U200 board, the computation took 34.68 s (57.67 bapps). In this study, we analyze the results in terms of the classic metrics of speed-up and efficiency and give hints for future improvements focused on reducing the global job time. MDPI 2022-09-27 /pmc/articles/PMC9570381/ /pubmed/36232709 http://dx.doi.org/10.3390/ijms231911408 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
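The throughput figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that "2 million particles" means exactly 2,000,000 and that throughput counts unordered pairs, N(N-1)/2; both are assumptions of this example, and the helper names are hypothetical. The ratio of two run times is shown only as a relative speed between boards, not as the speed-up definition used in the article.

```python
def pair_count(n_particles):
    """Number of unordered particle pairs, N(N-1)/2."""
    return n_particles * (n_particles - 1) // 2


def bapps(n_particles, seconds):
    """Throughput in billions of atom pairs per second."""
    return pair_count(n_particles) / seconds / 1e9


def relative_speed(t_reference, t_new):
    """Ratio of two run times for the same job (values > 1 mean t_new is faster)."""
    return t_reference / t_new


if __name__ == "__main__":
    N = 2_000_000  # assumed exact particle count
    # Run times in seconds, taken from the abstract.
    times = {"Intel D5005": 81.57, "ZU102": 115.31, "Alveo U200": 34.68}
    for board, t in times.items():
        print(f"{board}: {bapps(N, t):.2f} bapps")
    print(f"U200 vs D5005: {relative_speed(81.57, 34.68):.2f}x")
```

Under these assumptions the computed values agree with the 24.5 and 57.67 bapps reported for the D5005 and the U200, which suggests the reported rates are indeed based on the full set of unordered pairs.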
spellingShingle Article
González, César
Balocco, Simone
Bosch, Jaume
de Haro, Juan Miguel
Paolini, Maurizio
Filgueras, Antonio
Álvarez, Carlos
Pons, Ramon
High Performance Computing PP-Distance Algorithms to Generate X-ray Spectra from 3D Models
title High Performance Computing PP-Distance Algorithms to Generate X-ray Spectra from 3D Models
title_full High Performance Computing PP-Distance Algorithms to Generate X-ray Spectra from 3D Models
title_fullStr High Performance Computing PP-Distance Algorithms to Generate X-ray Spectra from 3D Models
title_full_unstemmed High Performance Computing PP-Distance Algorithms to Generate X-ray Spectra from 3D Models
title_short High Performance Computing PP-Distance Algorithms to Generate X-ray Spectra from 3D Models
title_sort high performance computing pp-distance algorithms to generate x-ray spectra from 3d models
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9570381/
https://www.ncbi.nlm.nih.gov/pubmed/36232709
http://dx.doi.org/10.3390/ijms231911408
work_keys_str_mv AT gonzalezcesar highperformancecomputingppdistancealgorithmstogeneratexrayspectrafrom3dmodels
AT baloccosimone highperformancecomputingppdistancealgorithmstogeneratexrayspectrafrom3dmodels
AT boschjaume highperformancecomputingppdistancealgorithmstogeneratexrayspectrafrom3dmodels
AT deharojuanmiguel highperformancecomputingppdistancealgorithmstogeneratexrayspectrafrom3dmodels
AT paolinimaurizio highperformancecomputingppdistancealgorithmstogeneratexrayspectrafrom3dmodels
AT filguerasantonio highperformancecomputingppdistancealgorithmstogeneratexrayspectrafrom3dmodels
AT alvarezcarlos highperformancecomputingppdistancealgorithmstogeneratexrayspectrafrom3dmodels
AT ponsramon highperformancecomputingppdistancealgorithmstogeneratexrayspectrafrom3dmodels