Cargando…
Real-time pneumonia prediction using pipelined spark and high-performance computing
BACKGROUND: Pneumonia is a respiratory disease caused by bacteria; it affects many people, particularly in impoverished countries where pollution, unclean living standards, overpopulation, and insufficient medical infrastructures are prevalent. To guarantee curative therapy and boost survival chance...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
PeerJ Inc.
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280684/ https://www.ncbi.nlm.nih.gov/pubmed/37346542 http://dx.doi.org/10.7717/peerj-cs.1258 |
_version_ | 1785060852006649856 |
---|---|
author | Ravikumar, Aswathy Sriraman, Harini |
author_facet | Ravikumar, Aswathy Sriraman, Harini |
author_sort | Ravikumar, Aswathy |
collection | PubMed |
description | BACKGROUND: Pneumonia is a respiratory disease caused by bacteria; it affects many people, particularly in impoverished countries where pollution, unclean living standards, overpopulation, and insufficient medical infrastructures are prevalent. To guarantee curative therapy and boost survival chances, it is vital to detect pneumonia soon enough. Imaging using chest X-rays is the most common way of detecting pneumonia. However, analyzing chest X-rays is a complex process vulnerable to subjective variation. Moreover, the data available is growing exponentially, and it will take hours and days to train the model to predict pneumonia. Timely prediction is significant to guarantee a better cure and treatment. Existing work provided by different authors needs more precision, and the computation time for predicting pneumonia is also much longer. Therefore, there is a requirement for early forecasting. Using X-ray picture samples, the system must have a continuous and unsupervised learning system for early diagnosis. METHODS: In this article, the training time of the model is accelerated using the distributed data-parallel approach and the computational power of high-performance computing devices. This research aims to diagnose pneumonia using X-ray pictures with more precision, greater speed, and fewer processing resources. Distributed deep learning techniques are gaining popularity owing to the rising need for computational resources for deep learning models with several parameters. In contrast to conventional training methods, data-parallel training enables several compute nodes to train massive deep-learning models to improve training efficiency concurrently. Deploying the model in Spark solves the scalability and acceleration. Spark’s distributed processing capability reads data from multiple nodes, and the results demonstrate that training time can be drastically reduced by utilizing these techniques, which is a significant necessity when dealing with large datasets. RESULTS: The proposed model makes the prediction 1.5 times faster than the traditional CNN model used for pneumonia prediction. The model also achieved an accuracy of 98.72%. The speed-up varying from 1.2 to 1.5 was obtained in the synchronous and asynchronous parallel model. The speed-up is reduced in the parallel asynchronous model due to the presence of straggler nodes. |
format | Online Article Text |
id | pubmed-10280684 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-102806842023-06-21 Real-time pneumonia prediction using pipelined spark and high-performance computing Ravikumar, Aswathy Sriraman, Harini PeerJ Comput Sci Bioinformatics BACKGROUND: Pneumonia is a respiratory disease caused by bacteria; it affects many people, particularly in impoverished countries where pollution, unclean living standards, overpopulation, and insufficient medical infrastructures are prevalent. To guarantee curative therapy and boost survival chances, it is vital to detect pneumonia soon enough. Imaging using chest X-rays is the most common way of detecting pneumonia. However, analyzing chest X-rays is a complex process vulnerable to subjective variation. Moreover, the data available is growing exponentially, and it will take hours and days to train the model to predict pneumonia. Timely prediction is significant to guarantee a better cure and treatment. Existing work provided by different authors needs more precision, and the computation time for predicting pneumonia is also much longer. Therefore, there is a requirement for early forecasting. Using X-ray picture samples, the system must have a continuous and unsupervised learning system for early diagnosis. METHODS: In this article, the training time of the model is accelerated using the distributed data-parallel approach and the computational power of high-performance computing devices. This research aims to diagnose pneumonia using X-ray pictures with more precision, greater speed, and fewer processing resources. Distributed deep learning techniques are gaining popularity owing to the rising need for computational resources for deep learning models with several parameters. In contrast to conventional training methods, data-parallel training enables several compute nodes to train massive deep-learning models to improve training efficiency concurrently. Deploying the model in Spark solves the scalability and acceleration. Spark’s distributed processing capability reads data from multiple nodes, and the results demonstrate that training time can be drastically reduced by utilizing these techniques, which is a significant necessity when dealing with large datasets. RESULTS: The proposed model makes the prediction 1.5 times faster than the traditional CNN model used for pneumonia prediction. The model also achieved an accuracy of 98.72%. The speed-up varying from 1.2 to 1.5 was obtained in the synchronous and asynchronous parallel model. The speed-up is reduced in the parallel asynchronous model due to the presence of straggler nodes. PeerJ Inc. 2023-03-09 /pmc/articles/PMC10280684/ /pubmed/37346542 http://dx.doi.org/10.7717/peerj-cs.1258 Text en © 2023 Ravikumar and Sriraman https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Bioinformatics Ravikumar, Aswathy Sriraman, Harini Real-time pneumonia prediction using pipelined spark and high-performance computing |
title | Real-time pneumonia prediction using pipelined spark and high-performance computing |
title_full | Real-time pneumonia prediction using pipelined spark and high-performance computing |
title_fullStr | Real-time pneumonia prediction using pipelined spark and high-performance computing |
title_full_unstemmed | Real-time pneumonia prediction using pipelined spark and high-performance computing |
title_short | Real-time pneumonia prediction using pipelined spark and high-performance computing |
title_sort | real-time pneumonia prediction using pipelined spark and high-performance computing |
topic | Bioinformatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280684/ https://www.ncbi.nlm.nih.gov/pubmed/37346542 http://dx.doi.org/10.7717/peerj-cs.1258 |
work_keys_str_mv | AT ravikumaraswathy realtimepneumoniapredictionusingpipelinedsparkandhighperformancecomputing AT sriramanharini realtimepneumoniapredictionusingpipelinedsparkandhighperformancecomputing |