Performance analysis of data mining algorithms for diagnosing COVID-19

Bibliographic Details
Main Authors: Nopour, Raoof, Kazemi-Arpanahi, Hadi, Shanbehzadeh, Mostafa, Azizifar, Akbar
Format: Online Article Text
Language: English
Published: Wolters Kluwer - Medknow 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8719570/
https://www.ncbi.nlm.nih.gov/pubmed/35071611
http://dx.doi.org/10.4103/jehp.jehp_138_21
author Nopour, Raoof
Kazemi-Arpanahi, Hadi
Shanbehzadeh, Mostafa
Azizifar, Akbar
collection PubMed
description BACKGROUND: An outbreak of atypical pneumonia termed COVID-19 has spread worldwide since the beginning of 2020. Designing a prediction system for the early detection of COVID-19 is therefore critical to mitigating the spread of the virus. In this study, we applied selected machine learning techniques and identified the best predictive model based on performance. MATERIALS AND METHODS: Data on 435 suspected COVID-19 cases recorded in the Imam Khomeini Hospital database between May 9, 2020 and December 20, 2020 were analyzed. The Chi-square method was used to determine the most important features for diagnosing COVID-19. Eight data mining algorithms were then applied: multilayer perceptron (MLP), J-48, Bayesian network (Bayes Net), logistic regression, K-star, random forest, AdaBoost, and sequential minimal optimization (SMO). Finally, the most appropriate diagnostic model for COVID-19 was selected by comparing the performance of these algorithms. RESULTS: The Chi-square method identified 21 variables as the most important diagnostic criteria for COVID-19. Among the eight algorithms, J-48 performed best, with true-positive rate = 0.85, false-positive rate = 0.173, precision = 0.85, recall = 0.85, F-score = 0.85, Matthews correlation coefficient = 0.68, and area under the receiver operating characteristic curve = 0.68. CONCLUSION: The performance evaluation indicates that J-48 can be considered a suitable computational prediction model for diagnosing COVID-19.
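The performance criteria named in the description (true-positive rate, false-positive rate, precision, recall, F-score, and Matthews correlation coefficient) can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study, and the record does not say whether the reported figures are per-class values or averages across classes.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic metrics from binary confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives, and false negatives. Any ratio with a zero denominator
    is returned as 0.0 (a convention chosen for this sketch).
    """
    def ratio(num, den):
        return num / den if den else 0.0

    tpr = ratio(tp, tp + fn)        # true-positive rate, identical to recall
    fpr = ratio(fp, fp + tn)        # false-positive rate
    precision = ratio(tp, tp + fp)
    f_score = ratio(2 * precision * tpr, precision + tpr)
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {
        "tpr": tpr,
        "fpr": fpr,
        "precision": precision,
        "recall": tpr,
        "f_score": f_score,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only; NOT data from the study.
metrics = binary_metrics(tp=80, fp=20, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike precision and recall, the Matthews correlation coefficient uses all four cells of the matrix, which is one reason it is often reported alongside the F-score when the classes are imbalanced.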
format Online
Article
Text
id pubmed-8719570
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Wolters Kluwer - Medknow
record_format MEDLINE/PubMed
spelling pubmed-8719570 2022-01-20 Performance analysis of data mining algorithms for diagnosing COVID-19 Nopour, Raoof Kazemi-Arpanahi, Hadi Shanbehzadeh, Mostafa Azizifar, Akbar J Educ Health Promot Original Article
Wolters Kluwer - Medknow 2021-11-30 /pmc/articles/PMC8719570/ /pubmed/35071611 http://dx.doi.org/10.4103/jehp.jehp_138_21 Text en Copyright: © 2021 Journal of Education and Health Promotion https://creativecommons.org/licenses/by-nc-sa/4.0/ This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms.
title Performance analysis of data mining algorithms for diagnosing COVID-19
topic Original Article