Graph data science and machine learning for the detection of COVID-19 infection from symptoms

BACKGROUND: COVID-19 is an infectious disease caused by SARS-CoV-2. Its symptoms range from mild to moderate respiratory illness, and some cases require urgent medication. Therefore, it is crucial to detect COVID-19 at an early stage through specific clinical tests, testing kits, and...

Bibliographic Details
Main authors: Alqaissi, Eman; Alotaibi, Fahd; Ramzan, Muhammad Sher
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2023
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280642/
https://www.ncbi.nlm.nih.gov/pubmed/37346701
http://dx.doi.org/10.7717/peerj-cs.1333
author Alqaissi, Eman
Alotaibi, Fahd
Ramzan, Muhammad Sher
collection PubMed
description BACKGROUND: COVID-19 is an infectious disease caused by SARS-CoV-2. Its symptoms range from mild to moderate respiratory illness, and some cases require urgent medication. Therefore, it is crucial to detect COVID-19 at an early stage through specific clinical tests, testing kits, and medical devices. However, these tests are not always available during a pandemic. This study therefore developed an automatic, intelligent, rapid, and real-time diagnostic model for the early detection of COVID-19 based on its symptoms.
METHODS: The COVID-19 knowledge graph (KG), constructed based on literature from heterogeneous data, was imported to understand the different COVID-19 relations. We added the human disease ontology to the COVID-19 KG and applied a node-embedding graph algorithm, fast random projection, to extract an extra feature from the COVID-19 dataset. Subsequently, experiments were conducted using two machine learning (ML) pipelines to predict COVID-19 infection from its symptoms. Automatic tuning of the model hyperparameters was also adopted.
RESULTS: We compared two graph-based ML models: logistic regression (LR) and random forest (RF). The proposed graph-based RF model achieved a low error rate of 0.0064 and the best scores on all performance metrics, including specificity of 98.71%, accuracy of 99.36%, precision of 99.65%, recall of 99.53%, and F1-score of 99.59%. Furthermore, the Matthews correlation coefficient achieved by the RF model was higher than that of the LR model. Comparative analysis with other ML algorithms and with studies from the literature showed that the proposed RF model exhibited the best detection accuracy.
CONCLUSION: The graph-based RF model registered high performance in classifying the symptoms of COVID-19 infection, indicating that graph data science, in conjunction with ML techniques, helps improve performance and accelerate innovation.
format Online
Article
Text
id pubmed-10280642
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-10280642 2023-06-21 PeerJ Comput Sci Bioinformatics PeerJ Inc. 2023-04-10 /pmc/articles/PMC10280642/ /pubmed/37346701 http://dx.doi.org/10.7717/peerj-cs.1333 Text en
© 2023 Alqaissi et al. https://creativecommons.org/licenses/by-nc/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by-nc/4.0/), which permits using, remixing, and building upon the work non-commercially, as long as it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title Graph data science and machine learning for the detection of COVID-19 infection from symptoms
topic Bioinformatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10280642/
https://www.ncbi.nlm.nih.gov/pubmed/37346701
http://dx.doi.org/10.7717/peerj-cs.1333
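
The abstract in this record reports specificity, accuracy, precision, recall, F1-score, error rate, and the Matthews correlation coefficient for the graph-based models. As a rough illustration of how these quantities relate to one another, the sketch below derives them from binary confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the article; the article's own confusion matrix is not given in this record. The one check tied to the record is that the reported error rate (0.0064) appears consistent with 1 minus the reported accuracy (99.36%).

```python
from math import sqrt


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive common binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. The counts used below are illustrative only.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formulas.
    for name, value in classification_metrics(tp=90, fp=5, tn=95, fn=10).items():
        print(f"{name}: {value:.4f}")
    # Consistency check on the figures quoted in the abstract:
    # error rate = 1 - accuracy.
    print(round(1 - 0.9936, 4))
```

Because the error rate is simply the complement of accuracy, it carries no information beyond the accuracy figure; the Matthews correlation coefficient is the one metric in the list that uses all four confusion-matrix cells at once.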