Strategies of data layout and cache writing for input-output optimization in high performance scientific computing: Applications to the forward electrocardiographic problem

Input-output (I/O) optimization at the low-level design of data layout on disk drastically impacts the efficiency of high performance computing (HPC) applications. However, such a low-level optimization is in general challenging, especially when using popular scientific file formats designed with an...

Bibliographic Details
Main authors: Cardone-Noott, Louie; Rodriguez, Blanca; Bueno-Orovio, Alfonso
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6107169/
https://www.ncbi.nlm.nih.gov/pubmed/30138401
http://dx.doi.org/10.1371/journal.pone.0202410
_version_ 1783349925225955328
author Cardone-Noott, Louie
Rodriguez, Blanca
Bueno-Orovio, Alfonso
author_facet Cardone-Noott, Louie
Rodriguez, Blanca
Bueno-Orovio, Alfonso
author_sort Cardone-Noott, Louie
collection PubMed
description Input-output (I/O) optimization at the low-level design of data layout on disk drastically impacts the efficiency of high performance computing (HPC) applications. However, such a low-level optimization is in general challenging, especially when using popular scientific file formats designed with an emphasis on portability and flexibility. To reconcile these two aspects, we present a novel low-level data layout for HPC applications, fully independent of the number of dimensions in the dataset. The new data layout improves reading and writing efficiency in large HPC applications using many processors, and in particular during parallel post-processing. Furthermore, its combination with a cached write mode, in order to aggregate multiple writes into larger ones, substantially decreased the writing times of the proposed strategy. When applied to our simulation framework for the forward calculation of the human electrocardiogram, the combined strategy resulted in drastic improvements in I/O performance, of up to 40% in writing and 93–98% in reading for post-processing tasks. Given the generality of the proposed strategies and scientific file formats used, our results may represent significant improvements in I/O performance of HPC applications across multiple disciplines, reducing execution and post-processing times and leading to a more efficient use of HPC resource envelopes.
format Online
Article
Text
id pubmed-6107169
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6107169 2018-08-30 Strategies of data layout and cache writing for input-output optimization in high performance scientific computing: Applications to the forward electrocardiographic problem Cardone-Noott, Louie Rodriguez, Blanca Bueno-Orovio, Alfonso PLoS One Research Article Input-output (I/O) optimization at the low-level design of data layout on disk drastically impacts the efficiency of high performance computing (HPC) applications. However, such a low-level optimization is in general challenging, especially when using popular scientific file formats designed with an emphasis on portability and flexibility. To reconcile these two aspects, we present a novel low-level data layout for HPC applications, fully independent of the number of dimensions in the dataset. The new data layout improves reading and writing efficiency in large HPC applications using many processors, and in particular during parallel post-processing. Furthermore, its combination with a cached write mode, in order to aggregate multiple writes into larger ones, substantially decreased the writing times of the proposed strategy. When applied to our simulation framework for the forward calculation of the human electrocardiogram, the combined strategy resulted in drastic improvements in I/O performance, of up to 40% in writing and 93–98% in reading for post-processing tasks. Given the generality of the proposed strategies and scientific file formats used, our results may represent significant improvements in I/O performance of HPC applications across multiple disciplines, reducing execution and post-processing times and leading to a more efficient use of HPC resource envelopes.
Public Library of Science 2018-08-23 /pmc/articles/PMC6107169/ /pubmed/30138401 http://dx.doi.org/10.1371/journal.pone.0202410 Text en © 2018 Cardone-Noott et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Cardone-Noott, Louie
Rodriguez, Blanca
Bueno-Orovio, Alfonso
Strategies of data layout and cache writing for input-output optimization in high performance scientific computing: Applications to the forward electrocardiographic problem
title Strategies of data layout and cache writing for input-output optimization in high performance scientific computing: Applications to the forward electrocardiographic problem
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6107169/
https://www.ncbi.nlm.nih.gov/pubmed/30138401
http://dx.doi.org/10.1371/journal.pone.0202410