Application of machine learning for the diagnosis of COVID-19
This chapter focuses on the application of machine learning algorithms to the diagnosis of the novel coronavirus disease (COVID-19). First, data visualization is provided on increases in confirmed deaths and recovered cases of COVID-19 using currently available data from Johns Hopkins University. Ne...
Main authors: | Podder, Prajoy; Bharati, Subrato; Mondal, M. Rubaiyat Hossain; Kose, Utku |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8137818/ http://dx.doi.org/10.1016/B978-0-12-824536-1.00008-3 |
_version_ | 1783695678158929920 |
---|---|
author | Podder, Prajoy Bharati, Subrato Mondal, M. Rubaiyat Hossain Kose, Utku |
author_facet | Podder, Prajoy Bharati, Subrato Mondal, M. Rubaiyat Hossain Kose, Utku |
author_sort | Podder, Prajoy |
collection | PubMed |
description | This chapter focuses on the application of machine learning algorithms to the diagnosis of the novel coronavirus disease (COVID-19). First, data visualization is provided on increases in confirmed deaths and recovered cases of COVID-19 using currently available data from Johns Hopkins University. Next, machine learning algorithms are used for the automatic diagnosis of COVID-19. Data-driven diagnosis is performed using a dataset of 5644 samples with 111 attributes provided by Hospital Israelita Albert Einstein, Brazil. As a preprocessing step, null values and categorical data are processed and standardization is performed. Next, feature selection is performed to find the attributes that are most important for a COVID-19 diagnosis. A number of algorithms, including random forest, logistic regression (LR), XGBoost, and decision tree, are considered and their kernel parameters are optimized. The performance of the classification algorithms is evaluated in terms of testing accuracy, precision, recall, miss rate, the receiver operating characteristic curve, and the area under the receiver operating characteristic curve. Experimental results show that serum glucose is the most influential attribute in predicting COVID-19. Our results also show that for the case of cross-validation, XGBoost has the highest accuracy of 92.67% and logistic regression has the second-highest accuracy of 92.58%, whereas both XGBoost and LR have a 93% value for precision, recall, and F1 score. Moreover, for the case of the holdout method with 20% testing data, logistic regression, with an accuracy of 94.06%, outperforms the other classifiers in terms of accuracy, precision, recall, and F1 score. |
format | Online Article Text |
id | pubmed-8137818 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
record_format | MEDLINE/PubMed |
spelling | pubmed-81378182021-05-21 Application of machine learning for the diagnosis of COVID-19 Podder, Prajoy Bharati, Subrato Mondal, M. Rubaiyat Hossain Kose, Utku Data Science for COVID-19 Article This chapter focuses on the application of machine learning algorithms to the diagnosis of the novel coronavirus disease (COVID-19). First, data visualization is provided on increases in confirmed deaths and recovered cases of COVID-19 using currently available data from Johns Hopkins University. Next, machine learning algorithms are used for the automatic diagnosis of COVID-19. Data-driven diagnosis is performed using a dataset of 5644 samples with 111 attributes provided by Hospital Israelita Albert Einstein, Brazil. As a preprocessing step, null values and categorical data are processed and standardization is performed. Next, feature selection is performed to find the attributes that are most important for a COVID-19 diagnosis. A number of algorithms, including random forest, logistic regression (LR), XGBoost, and decision tree, are considered and their kernel parameters are optimized. The performance of the classification algorithms is evaluated in terms of testing accuracy, precision, recall, miss rate, the receiver operating characteristic curve, and the area under the receiver operating characteristic curve. Experimental results show that serum glucose is the most influential attribute in predicting COVID-19. Our results also show that for the case of cross-validation, XGBoost has the highest accuracy of 92.67% and logistic regression has the second-highest accuracy of 92.58%, whereas both XGBoost and LR have a 93% value for precision, recall, and F1 score. Moreover, for the case of the holdout method with 20% testing data, logistic regression, with an accuracy of 94.06%, outperforms the other classifiers in terms of accuracy, precision, recall, and F1 score. 
2021 2021-05-21 /pmc/articles/PMC8137818/ http://dx.doi.org/10.1016/B978-0-12-824536-1.00008-3 Text en Copyright © 2021 Elsevier Inc. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Podder, Prajoy Bharati, Subrato Mondal, M. Rubaiyat Hossain Kose, Utku Application of machine learning for the diagnosis of COVID-19 |
title | Application of machine learning for the diagnosis of COVID-19 |
title_full | Application of machine learning for the diagnosis of COVID-19 |
title_fullStr | Application of machine learning for the diagnosis of COVID-19 |
title_full_unstemmed | Application of machine learning for the diagnosis of COVID-19 |
title_short | Application of machine learning for the diagnosis of COVID-19 |
title_sort | application of machine learning for the diagnosis of covid-19 |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8137818/ http://dx.doi.org/10.1016/B978-0-12-824536-1.00008-3 |
work_keys_str_mv | AT podderprajoy applicationofmachinelearningforthediagnosisofcovid19 AT bharatisubrato applicationofmachinelearningforthediagnosisofcovid19 AT mondalmrubaiyathossain applicationofmachinelearningforthediagnosisofcovid19 AT koseutku applicationofmachinelearningforthediagnosisofcovid19 |
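
The description above names accuracy, precision, recall, miss rate, and F1 score as evaluation criteria. As a minimal sketch of how those quantities relate, assuming a binary classifier summarized by confusion-matrix counts, the snippet below derives each one using only the Python standard library. The `ConfusionCounts` container, the `classification_metrics` helper, and the example counts are hypothetical illustrations; they are not taken from the chapter and do not reproduce its results.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container for illustration)."""
    tp: int  # true positives: positive cases predicted positive
    fp: int  # false positives: negative cases predicted positive
    fn: int  # false negatives: positive cases predicted negative
    tn: int  # true negatives: negative cases predicted negative


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute standard binary-classification metrics from confusion counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn),
        "precision": precision,
        "recall": recall,
        # Miss rate is the false-negative rate, FN / (TP + FN).
        "miss_rate": _ratio(c.fn, c.tp + c.fn),
        # F1 is the harmonic mean of precision and recall.
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Made-up counts, chosen only to make the arithmetic easy to follow.
    example = ConfusionCounts(tp=40, fp=10, fn=10, tn=140)
    for name, value in classification_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

Whenever at least one positive case is present, miss rate and recall sum to 1, which is why a high recall is the quantity of interest when missed diagnoses are costly.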