Applying neural networks to predict HPC-I/O bandwidth over seismic data on lustre file system for ExSeisDat
HPC or super-computing clusters are designed for executing computationally intensive operations that typically involve large scale I/O operations. This most commonly involves using a standard MPI library implemented in C/C++. The MPI-I/O performance in HPC clusters tends to vary significantly over a...
Main authors: | Tipu, Abdul Jabbar Saeed; Conbhuí, Padraig Ó; Howley, Enda |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9283189/ https://www.ncbi.nlm.nih.gov/pubmed/35855775 http://dx.doi.org/10.1007/s10586-021-03347-8 |
_version_ | 1784747280245456896 |
---|---|
author | Tipu, Abdul Jabbar Saeed; Conbhuí, Padraig Ó; Howley, Enda |
author_facet | Tipu, Abdul Jabbar Saeed; Conbhuí, Padraig Ó; Howley, Enda |
author_sort | Tipu, Abdul Jabbar Saeed |
collection | PubMed |
description | HPC or supercomputing clusters are designed to execute computationally intensive operations that typically involve large-scale I/O. This most commonly involves using a standard MPI library implemented in C/C++. MPI-I/O performance in HPC clusters tends to vary significantly over a range of configuration parameters that are generally not taken into account by the algorithm. It is commonly left to individual practitioners to optimise I/O on a case-by-case basis at code level, which can often lead to a range of unforeseen outcomes. The ExSeisDat utility is built on top of the native MPI-I/O library and comprises Parallel I/O and Workflow Libraries for processing seismic data encapsulated in the SEG-Y file format. The SEG-Y file data structure is complex because trace headers alternate with trace data. Its size scales to petabytes, and the chances of I/O performance degradation are further increased by ExSeisDat. This research paper presents a novel study of how I/O performance, measured as bandwidth, changes against various MPI-I/O, Lustre (parallel) file system and SEG-Y file parameters, visualised with parallel plots. Another novel aspect of this research is the predictive modelling of MPI-I/O behaviour over SEG-Y file benchmarks using Artificial Neural Networks (ANNs). The accuracy ranges from 62.5% to 96.5% over the set of trained ANN models. The computed Mean Square Error (MSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) values further support the generalisation of the prediction models. This paper demonstrates that, by using the authors' ANN prediction technique, the configurations can be tuned beforehand to avoid poor I/O performance. |
format | Online Article Text |
id | pubmed-9283189 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-9283189 2022-07-16 Applying neural networks to predict HPC-I/O bandwidth over seismic data on lustre file system for ExSeisDat Tipu, Abdul Jabbar Saeed; Conbhuí, Padraig Ó; Howley, Enda Cluster Comput Article Springer US 2021-07-02 2022 /pmc/articles/PMC9283189/ /pubmed/35855775 http://dx.doi.org/10.1007/s10586-021-03347-8 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Tipu, Abdul Jabbar Saeed Conbhuí, Padraig Ó Howley, Enda Applying neural networks to predict HPC-I/O bandwidth over seismic data on lustre file system for ExSeisDat |
title | Applying neural networks to predict HPC-I/O bandwidth over seismic data on lustre file system for ExSeisDat |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9283189/ https://www.ncbi.nlm.nih.gov/pubmed/35855775 http://dx.doi.org/10.1007/s10586-021-03347-8 |
work_keys_str_mv | AT tipuabduljabbarsaeed applyingneuralnetworkstopredicthpciobandwidthoverseismicdataonlustrefilesystemforexseisdat AT conbhuipadraigo applyingneuralnetworkstopredicthpciobandwidthoverseismicdataonlustrefilesystemforexseisdat AT howleyenda applyingneuralnetworkstopredicthpciobandwidthoverseismicdataonlustrefilesystemforexseisdat |
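
The description in this record cites MSE, MAE and MAPE as supporting evidence for the prediction models but does not define them. The following is a minimal sketch of the standard textbook definitions, not the authors' code. The function names and the sample bandwidth values are assumptions made up for illustration; they are not results from the article.

```python
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Square Error: average of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent.

    Every value in `actual` must be non-zero, since each error is
    divided by the corresponding measured value.
    """
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical measured and predicted bandwidths (MB/s), invented for this sketch.
measured = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(f"MSE:  {mse(measured, predicted):.3f}")
print(f"MAE:  {mae(measured, predicted):.3f}")
print(f"MAPE: {mape(measured, predicted):.3f}%")
```

MSE penalises large misses more heavily than MAE, while MAPE expresses the error relative to the measured value, which makes it easier to compare across runs whose bandwidths differ by orders of magnitude.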