Performance analysis of data mining algorithms for diagnosing COVID-19
BACKGROUND: An outbreak of atypical pneumonia termed COVID-19 has spread worldwide since the beginning of 2020. Designing a prediction system for the early detection of COVID-19 is therefore critical to mitigating the spread of the virus. In this study, we have applied selected ma...
Main authors: | Nopour, Raoof; Kazemi-Arpanahi, Hadi; Shanbehzadeh, Mostafa; Azizifar, Akbar |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Wolters Kluwer - Medknow
2021
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8719570/ https://www.ncbi.nlm.nih.gov/pubmed/35071611 http://dx.doi.org/10.4103/jehp.jehp_138_21 |
_version_ | 1784624963096936448 |
---|---|
author | Nopour, Raoof Kazemi-Arpanahi, Hadi Shanbehzadeh, Mostafa Azizifar, Akbar |
author_facet | Nopour, Raoof Kazemi-Arpanahi, Hadi Shanbehzadeh, Mostafa Azizifar, Akbar |
author_sort | Nopour, Raoof |
collection | PubMed |
description | BACKGROUND: An outbreak of atypical pneumonia termed COVID-19 has spread worldwide since the beginning of 2020. Designing a prediction system for the early detection of COVID-19 is therefore critical to mitigating the spread of the virus. In this study, we applied selected machine learning techniques to identify the best predictive models based on their performance. MATERIALS AND METHODS: Data on 435 suspected COVID-19 cases recorded in the Imam Khomeini Hospital database between May 9, 2020 and December 20, 2020 were considered. The Chi-square method was used to determine the most important features for diagnosing COVID-19; eight selected data mining algorithms, namely multilayer perceptron (MLP), J-48, Bayesian network (Bayes Net), logistic regression, K-star, random forest, AdaBoost, and sequential minimal optimization (SMO), were then applied. Finally, the most appropriate diagnostic model for COVID-19 was identified by comparing the performance of the selected algorithms. RESULTS: The Chi-square method identified 21 variables as the most important diagnostic criteria for COVID-19. Evaluation of the eight selected data mining algorithms showed that J-48, with true-positive rate = 0.85, false-positive rate = 0.173, precision = 0.85, recall = 0.85, F-score = 0.85, Matthews correlation coefficient = 0.68, and area under the receiver operating characteristic curve = 0.68, performed better than the other algorithms. CONCLUSION: The evaluation of the performance criteria showed that J-48 can be considered a suitable computational prediction model for diagnosing COVID-19. |
format | Online Article Text |
id | pubmed-8719570 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Wolters Kluwer - Medknow |
record_format | MEDLINE/PubMed |
spelling | pubmed-8719570 2022-01-20 Performance analysis of data mining algorithms for diagnosing COVID-19 Nopour, Raoof Kazemi-Arpanahi, Hadi Shanbehzadeh, Mostafa Azizifar, Akbar J Educ Health Promot Original Article
Wolters Kluwer - Medknow 2021-11-30 /pmc/articles/PMC8719570/ /pubmed/35071611 http://dx.doi.org/10.4103/jehp.jehp_138_21 Text en Copyright: © 2021 Journal of Education and Health Promotion https://creativecommons.org/licenses/by-nc-sa/4.0/ This is an open access journal, and articles are distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as appropriate credit is given and the new creations are licensed under the identical terms. |
spellingShingle | Original Article Nopour, Raoof Kazemi-Arpanahi, Hadi Shanbehzadeh, Mostafa Azizifar, Akbar Performance analysis of data mining algorithms for diagnosing COVID-19 |
title | Performance analysis of data mining algorithms for diagnosing COVID-19 |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8719570/ https://www.ncbi.nlm.nih.gov/pubmed/35071611 http://dx.doi.org/10.4103/jehp.jehp_138_21 |
work_keys_str_mv | AT nopourraoof performanceanalysisofdataminingalgorithmsfordiagnosingcovid19 AT kazemiarpanahihadi performanceanalysisofdataminingalgorithmsfordiagnosingcovid19 AT shanbehzadehmostafa performanceanalysisofdataminingalgorithmsfordiagnosingcovid19 AT azizifarakbar performanceanalysisofdataminingalgorithmsfordiagnosingcovid19 |
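
The performance measures named in the description field (true-positive rate, false-positive rate, precision, recall, F-score, and Matthews correlation coefficient) all derive from a binary confusion matrix. The following is a minimal Python sketch of those standard definitions. The function name `binary_metrics` and the example counts are hypothetical illustrations; they are not taken from the study and do not reproduce its evaluation.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives, and
    false negatives. All denominators are assumed to be non-zero.
    """
    tpr = tp / (tp + fn)            # true-positive rate, identical to recall
    fpr = fp / (fp + tn)            # false-positive rate
    precision = tp / (tp + fp)
    f_score = 2 * precision * tpr / (precision + tpr)  # harmonic mean (F1)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "tpr": tpr,
        "fpr": fpr,
        "precision": precision,
        "recall": tpr,
        "f_score": f_score,
        "mcc": mcc,
    }


# Hypothetical counts chosen for easy arithmetic, NOT data from the study.
example = binary_metrics(tp=85, fp=15, tn=85, fn=15)
print(example)
```

Because recall and true-positive rate are the same quantity, the two always coincide in a report like the one above; the Matthews correlation coefficient additionally accounts for true negatives, which is why it can differ noticeably from the F-score.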