
An Overview of Supervised Machine Learning Methods and Data Analysis for COVID-19 Detection

METHODS: Our analysis and machine learning algorithm are based on the two most-cited clinical datasets in the literature: one from San Raffaele Hospital in Milan, Italy, and the other from Hospital Israelita Albert Einstein in São Paulo, Brazil. The datasets were processed to select the best features that mo...


Bibliographic Details
Main Authors: Tchagna Kouanou, Aurelle; Mih Attia, Thomas; Feudjio, Cyrille; Djeumo, Anges Fleurio; Ngo Mouelas, Adèle; Nzogang, Mendel Patrice; Tchito Tchapga, Christian; Tchiotsop, Daniel
Format: Online Article Text
Language: English
Published: Hindawi 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8629644/
https://www.ncbi.nlm.nih.gov/pubmed/34853669
http://dx.doi.org/10.1155/2021/4733167
_version_ 1784607252676608000
author Tchagna Kouanou, Aurelle
Mih Attia, Thomas
Feudjio, Cyrille
Djeumo, Anges Fleurio
Ngo Mouelas, Adèle
Nzogang, Mendel Patrice
Tchito Tchapga, Christian
Tchiotsop, Daniel
collection PubMed
description METHODS: Our analysis and machine learning algorithm are based on the two most-cited clinical datasets in the literature: one from San Raffaele Hospital in Milan, Italy, and the other from Hospital Israelita Albert Einstein in São Paulo, Brazil. The datasets were processed to select the features that most influence the target, and almost all of them turned out to be blood parameters. Exploratory data analysis (EDA) methods were applied to the datasets, and a comparative study of supervised machine learning models was carried out, after which the support vector machine (SVM) was selected as the best-performing model. RESULTS: As the best performer, SVM is used as our proposed supervised machine learning algorithm. An accuracy of 99.29%, a sensitivity of 92.79%, and a specificity of 100% were obtained with the dataset from Kaggle (https://www.kaggle.com/einsteindata4u/covid19) after optimizing the SVM. The same procedure was applied to the dataset from San Raffaele Hospital (https://zenodo.org/record/3886927#.YIluB5AzbMV). Once more, the SVM performed best among the machine learning algorithms compared, with an accuracy of 92.86%, a sensitivity of 93.55%, and a specificity of 90.91%. CONCLUSION: Compared with other results in the literature based on these same datasets, the results obtained are superior, leading us to conclude that our proposed solution is reliable for COVID-19 diagnosis.
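The three figures quoted in the description (accuracy, sensitivity, specificity) all derive from the counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code: the names `ConfusionCounts`, `count_outcomes`, `accuracy`, `sensitivity`, and `specificity` are hypothetical, and the labels in the usage lines are made-up illustration data, not values from either dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # positive cases predicted positive
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative
    fn: int  # positive cases predicted negative


def count_outcomes(y_true, y_pred, positive=1):
    """Tally TP/FP/TN/FN from paired true and predicted labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def accuracy(c):
    """Share of all cases classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c):
    """Share of truly positive cases detected (recall).

    Assumes at least one positive case is present.
    """
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Share of truly negative cases correctly rejected.

    Assumes at least one negative case is present.
    """
    return c.tn / (c.tn + c.fp)


# Made-up labels for illustration only: 1 = COVID-19 positive, 0 = negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0]
counts = count_outcomes(y_true, y_pred)
print(accuracy(counts), sensitivity(counts), specificity(counts))
```

In this toy case no negative is misclassified, so specificity is 1.0, while one missed positive lowers sensitivity to 2/3. The same mechanism explains how a classifier can combine 100% specificity with a lower sensitivity.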
format Online
Article
Text
id pubmed-8629644
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8629644 2021-11-30 An Overview of Supervised Machine Learning Methods and Data Analysis for COVID-19 Detection Tchagna Kouanou, Aurelle; Mih Attia, Thomas; Feudjio, Cyrille; Djeumo, Anges Fleurio; Ngo Mouelas, Adèle; Nzogang, Mendel Patrice; Tchito Tchapga, Christian; Tchiotsop, Daniel J Healthc Eng Review Article Hindawi 2021-11-22 /pmc/articles/PMC8629644/ /pubmed/34853669 http://dx.doi.org/10.1155/2021/4733167 Text en Copyright © 2021 Aurelle Tchagna Kouanou et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title An Overview of Supervised Machine Learning Methods and Data Analysis for COVID-19 Detection
topic Review Article