Application of machine learning for the diagnosis of COVID-19

This chapter focuses on the application of machine learning algorithms to the diagnosis of the novel coronavirus disease (COVID-19). First, data visualization is provided on increases in confirmed deaths and recovered cases of COVID-19 using currently available data from Johns Hopkins University. Ne...

Bibliographic Details
Main authors: Podder, Prajoy; Bharati, Subrato; Mondal, M. Rubaiyat Hossain; Kose, Utku
Format: Online Article Text
Language: English
Published: 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8137818/
http://dx.doi.org/10.1016/B978-0-12-824536-1.00008-3
_version_ 1783695678158929920
author Podder, Prajoy
Bharati, Subrato
Mondal, M. Rubaiyat Hossain
Kose, Utku
author_sort Podder, Prajoy
collection PubMed
description This chapter focuses on the application of machine learning algorithms to the diagnosis of the novel coronavirus disease (COVID-19). First, data visualization is provided on increases in confirmed deaths and recovered cases of COVID-19 using currently available data from Johns Hopkins University. Next, machine learning algorithms are used for the automatic diagnosis of COVID-19. Data-driven diagnosis is performed using a dataset of 5644 samples with 111 attributes provided by Hospital Israelita Albert Einstein, Brazil. As a preprocessing step, null values and categorical data are processed and standardization is performed. Next, feature selection is performed to find the attributes that are most important for a COVID-19 diagnosis. A number of algorithms, including random forest, logistic regression (LR), XGBoost, and decision tree, are considered and their kernel parameters are optimized. The performance of the classification algorithms is evaluated in terms of a number of factors, including testing accuracy, precision, recall, miss rate, the receiver operating characteristic curve, and the area under the receiver operating characteristic curve. Experimental results show that serum glucose is the most influential attribute in predicting COVID-19. Our results also show that, for the case of cross-validation, XGBoost has the highest accuracy of 92.67% and LR has the second highest accuracy of 92.58%, whereas both XGBoost and LR have a value of 93% for precision, recall, and F1 score. Moreover, for the case of the holdout method with 20% testing data, LR, with an accuracy of 94.06%, outperforms the other classifiers in terms of accuracy, precision, recall, and F1 score.
format Online
Article
Text
id pubmed-8137818
institution National Center for Biotechnology Information
language English
publishDate 2021
record_format MEDLINE/PubMed
spelling pubmed-8137818; 2021-05-21; Application of machine learning for the diagnosis of COVID-19; Podder, Prajoy; Bharati, Subrato; Mondal, M. Rubaiyat Hossain; Kose, Utku; Data Science for COVID-19; Article
2021 2021-05-21 /pmc/articles/PMC8137818/ http://dx.doi.org/10.1016/B978-0-12-824536-1.00008-3 Text en Copyright © 2021 Elsevier Inc. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Application of machine learning for the diagnosis of COVID-19
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8137818/
http://dx.doi.org/10.1016/B978-0-12-824536-1.00008-3
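
The description in this record names accuracy, precision, recall, miss rate, and F1 score as evaluation measures. As a reading aid, the following is a minimal sketch of how those measures are conventionally computed from binary predictions. It is not the chapter authors' code: the function name `binary_metrics`, the `BinaryMetrics` container, and the label lists are hypothetical, and the labels are made up for illustration rather than taken from the Hospital Israelita Albert Einstein dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    miss_rate: float
    f1: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute standard binary classification metrics from paired labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return BinaryMetrics(
        accuracy=ratio(tp + tn, len(pairs)),
        precision=precision,
        recall=recall,
        # Miss rate is the false negative rate, i.e. 1 - recall
        # whenever at least one positive sample exists.
        miss_rate=ratio(fn, tp + fn),
        f1=ratio(2 * precision * recall, precision + recall),
    )


# Made-up labels for illustration only (1 = positive, 0 = negative).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a diagnostic setting the miss rate is often the measure to watch, since it counts positive cases that the classifier labels as negative.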