Detecting the most critical clinical variables of COVID-19 breakthrough infection in vaccinated persons using machine learning

BACKGROUND: COVID-19 vaccines offer different levels of immune protection but do not provide 100% protection. Vaccinated persons with pre-existing comorbidities may be at an increased risk of SARS-CoV-2 breakthrough infection or reinfection. The aim of this study is to identify the critical variable...

Bibliographic Details
Main authors: Daramola, Olawande, Kavu, Tatenda Duncan, Kotze, Maritha J, Kamati, Oiva, Emjedi, Zaakiyah, Kabaso, Boniface, Moser, Thomas, Stroetmann, Karl, Fwemba, Isaac, Daramola, Fisayo, Nyirenda, Martha, van Rensburg, Susan J, Nyasulu, Peter S, Marnewick, Jeanine L
Format: Online Article Text
Language: English
Published: SAGE Publications 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10627023/
https://www.ncbi.nlm.nih.gov/pubmed/37936960
http://dx.doi.org/10.1177/20552076231207593
_version_ 1785131454109319168
author Daramola, Olawande
Kavu, Tatenda Duncan
Kotze, Maritha J
Kamati, Oiva
Emjedi, Zaakiyah
Kabaso, Boniface
Moser, Thomas
Stroetmann, Karl
Fwemba, Isaac
Daramola, Fisayo
Nyirenda, Martha
van Rensburg, Susan J
Nyasulu, Peter S
Marnewick, Jeanine L
author_sort Daramola, Olawande
collection PubMed
description BACKGROUND: COVID-19 vaccines offer different levels of immune protection but do not provide 100% protection. Vaccinated persons with pre-existing comorbidities may be at an increased risk of SARS-CoV-2 breakthrough infection or reinfection. The aim of this study is to identify the critical variables associated with a higher probability of SARS-CoV-2 breakthrough infection using machine learning. METHODS: A dataset comprising symptoms and feedback from 257 persons, of whom 203 were vaccinated and 54 unvaccinated, was used for the investigation. Three machine learning algorithms – Deep Multilayer Perceptron (Deep MLP), XGBoost, and Logistic Regression – were trained with the original (imbalanced) dataset and the balanced dataset created by using the Random Oversampling Technique (ROT), and the Synthetic Minority Oversampling Technique (SMOTE). We compared the performance of the classification algorithms when the features highly correlated with breakthrough infection were used and when all features in the dataset were used. RESULT: The results show that when highly correlated features were considered as predictors, with Random Oversampling to address data imbalance, the XGBoost classifier has the best performance (F1 = 0.96; accuracy = 0.96; AUC = 0.98; G-Mean = 0.98; MCC = 0.88). The Deep MLP had the second best performance (F1 = 0.94; accuracy = 0.94; AUC = 0.92; G-Mean = 0.70; MCC = 0.42), while Logistic Regression had less accurate performance (F1 = 0.89; accuracy = 0.88; AUC = 0.89; G-Mean = 0.89; MCC = 0.68). We also used Shapley Additive Explanations (SHAP) to investigate the interpretability of the models. We found that body temperature, total cholesterol, glucose level, blood pressure, waist circumference, body weight, body mass index (BMI), haemoglobin level, and physical activity per week are the most critical variables indicating a higher risk of breakthrough infection. 
CONCLUSION: These results, evident from our unique data source derived from apparently healthy volunteers with cardiovascular risk factors, follow the expected pattern of positive or negative correlations previously reported in the literature. This information strengthens the body of knowledge currently applied in public health guidelines and may also be used by medical practitioners in the future to reduce the risk of SARS-CoV-2 breakthrough infection.
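The abstract reports accuracy, F1, G-Mean and MCC side by side, and the Deep MLP figures (accuracy = 0.94 but G-Mean = 0.70 and MCC = 0.42) suggest why several metrics matter on imbalanced data. The following is a minimal Python sketch, not the authors' code: it shows one common way to do random oversampling and to compute these metrics from a binary confusion matrix. The function names (random_oversample, binary_metrics) and the toy labels are hypothetical, and the formulas are the standard binary definitions, which may differ from the averaging the study used.

```python
import math
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (random oversampling)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred):
    """Accuracy, F1, G-Mean and MCC for 0/1 labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # recall on positives
    specificity = tn / (tn + fp) if tn + fp else 0.0  # recall on negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    pr_sum = precision + sensitivity
    f1 = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0
    # G-Mean: geometric mean of the two per-class recalls.
    g_mean = math.sqrt(sensitivity * specificity)
    # MCC: correlation between true and predicted labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "g_mean": g_mean, "mcc": mcc}


if __name__ == "__main__":
    # Hypothetical toy data: 90 positives, 10 negatives, and a classifier
    # that labels almost everything positive.
    y_true = [1] * 90 + [0] * 10
    y_pred = [1] * 98 + [0] * 2
    print(binary_metrics(y_true, y_pred))

    rows, labels = random_oversample(list(range(100)), y_true)
    print(Counter(labels))
```

On this toy data the classifier reaches an accuracy of 0.92 while G-Mean is about 0.45 and MCC about 0.43, because only 2 of the 10 minority-class cases are recognised. If oversampling is used in practice, it is usually applied to the training split only, so that duplicated rows do not leak into the evaluation data.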
format Online
Article
Text
id pubmed-10627023
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-10627023 2023-11-07 Digit Health Original Research SAGE Publications 2023-11-05 /pmc/articles/PMC10627023/ /pubmed/37936960 http://dx.doi.org/10.1177/20552076231207593 Text en © The Author(s) 2023. This article is distributed under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/), which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Detecting the most critical clinical variables of COVID-19 breakthrough infection in vaccinated persons using machine learning
topic Original Research