
Real-time pneumonia prediction using pipelined spark and high-performance computing

BACKGROUND: Pneumonia is a respiratory disease caused by bacteria; it affects many people, particularly in impoverished countries where pollution, unclean living standards, overpopulation, and insufficient medical infrastructures are prevalent. To guarantee curative therapy and boost survival chance...


Bibliographic Details
Main authors: Ravikumar, Aswathy; Sriraman, Harini
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2023
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280684/
https://www.ncbi.nlm.nih.gov/pubmed/37346542
http://dx.doi.org/10.7717/peerj-cs.1258
author Ravikumar, Aswathy
Sriraman, Harini
collection PubMed
description BACKGROUND: Pneumonia is a respiratory disease caused by bacteria; it affects many people, particularly in impoverished countries where pollution, unclean living standards, overpopulation, and insufficient medical infrastructures are prevalent. To guarantee curative therapy and boost survival chances, it is vital to detect pneumonia soon enough. Imaging using chest X-rays is the most common way of detecting pneumonia. However, analyzing chest X-rays is a complex process vulnerable to subjective variation. Moreover, the data available is growing exponentially, and it will take hours and days to train the model to predict pneumonia. Timely prediction is significant to guarantee a better cure and treatment. Existing work provided by different authors needs more precision, and the computation time for predicting pneumonia is also much longer. Therefore, there is a requirement for early forecasting. Using X-ray picture samples, the system must have a continuous and unsupervised learning system for early diagnosis. METHODS: In this article, the training time of the model is accelerated using the distributed data-parallel approach and the computational power of high-performance computing devices. This research aims to diagnose pneumonia using X-ray pictures with more precision, greater speed, and fewer processing resources. Distributed deep learning techniques are gaining popularity owing to the rising need for computational resources for deep learning models with several parameters. In contrast to conventional training methods, data-parallel training enables several compute nodes to train massive deep-learning models to improve training efficiency concurrently. Deploying the model in Spark solves the scalability and acceleration. Spark’s distributed processing capability reads data from multiple nodes, and the results demonstrate that training time can be drastically reduced by utilizing these techniques, which is a significant necessity when dealing with large datasets. RESULTS: The proposed model makes the prediction 1.5 times faster than the traditional CNN model used for pneumonia prediction. The model also achieved an accuracy of 98.72%. The speed-up varying from 1.2 to 1.5 was obtained in the synchronous and asynchronous parallel model. The speed-up is reduced in the parallel asynchronous model due to the presence of straggler nodes.
format Online
Article
Text
id pubmed-10280684
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-10280684 2023-06-21 Real-time pneumonia prediction using pipelined spark and high-performance computing Ravikumar, Aswathy Sriraman, Harini PeerJ Comput Sci Bioinformatics PeerJ Inc. 2023-03-09 /pmc/articles/PMC10280684/ /pubmed/37346542 http://dx.doi.org/10.7717/peerj-cs.1258 Text en © 2023 Ravikumar and Sriraman https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title Real-time pneumonia prediction using pipelined spark and high-performance computing
topic Bioinformatics