Predictive Attributes for Developing Long COVID—A Study Using Machine Learning and Real-World Data from Primary Care Physicians in Germany
(1) In the present study, we used data comprising patient medical histories from a panel of primary care practices in Germany to predict post-COVID-19 conditions in patients after COVID-19 diagnosis and to evaluate the relevant factors associated with these conditions using machine learning methods....
Main authors: | Kessler, Roman; Philipp, Jos; Wilfer, Joanna; Kostev, Karel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10219004/ https://www.ncbi.nlm.nih.gov/pubmed/37240616 http://dx.doi.org/10.3390/jcm12103511 |
_version_ | 1785048907292606464 |
---|---|
author | Kessler, Roman; Philipp, Jos; Wilfer, Joanna; Kostev, Karel |
author_sort | Kessler, Roman |
collection | PubMed |
description | (1) In the present study, we used data comprising patient medical histories from a panel of primary care practices in Germany to predict post-COVID-19 conditions in patients after COVID-19 diagnosis and to evaluate the relevant factors associated with these conditions using machine learning methods. (2) Methods: Data retrieved from the IQVIA™ Disease Analyzer database were used. Patients with at least one COVID-19 diagnosis between January 2020 and July 2022 were selected for inclusion in the study. Age, sex, and the complete history of diagnoses and prescription data before COVID-19 infection at the respective primary care practice were extracted for each patient. A gradient boosting classifier (LGBM) was deployed. The prepared design matrix was randomly divided into train (80%) and test data (20%). After optimizing the hyperparameters of the LGBM classifier by maximizing the F2 score, model performance was evaluated using several test metrics. We calculated SHAP values to evaluate the importance of the individual features, but more importantly, to evaluate the direction of influence of each feature in our dataset, i.e., whether it is positively or negatively associated with a diagnosis of long COVID. (3) Results: In both the train and test data sets, the model showed a high recall (sensitivity) of 81% and 72% and a high specificity of 80% and 80%; this was offset, however, by a moderate precision of 8% and 7% and an F2-score of 0.28 and 0.25. The most common predictive features identified using SHAP included COVID-19 variant, physician practice, age, distinct number of diagnoses and therapies, sick days ratio, sex, vaccination rate, somatoform disorders, migraine, back pain, asthma, malaise and fatigue, as well as cough preparations. (4) Conclusions: The present exploratory study describes an initial investigation of the prediction of potential features increasing the risk of developing long COVID after COVID-19 infection by using the patient history from electronic medical records before COVID-19 infection in primary care practices in Germany using machine learning. Notably, we identified several predictive features for the development of long COVID in patient demographics and their medical histories. |
format | Online Article Text |
id | pubmed-10219004 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10219004 2023-05-27 Predictive Attributes for Developing Long COVID—A Study Using Machine Learning and Real-World Data from Primary Care Physicians in Germany Kessler, Roman; Philipp, Jos; Wilfer, Joanna; Kostev, Karel J Clin Med Article MDPI 2023-05-17 /pmc/articles/PMC10219004/ /pubmed/37240616 http://dx.doi.org/10.3390/jcm12103511 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Predictive Attributes for Developing Long COVID—A Study Using Machine Learning and Real-World Data from Primary Care Physicians in Germany |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10219004/ https://www.ncbi.nlm.nih.gov/pubmed/37240616 http://dx.doi.org/10.3390/jcm12103511 |
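
The abstract reports recall, specificity, precision, and F2 together. The sketch below shows how these figures relate to one another. It is not the authors' code: it uses the standard F-beta definition and Bayes' rule, and takes only the rounded percentages quoted in the abstract as input. The function names `f_beta` and `implied_prevalence` are illustrative assumptions, and the prevalence value is a back-of-envelope figure derived from rounded numbers, not a result stated in the record.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score: weighted harmonic mean of precision and recall.

    beta = 2 weights recall more heavily than precision (the F2 score).
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def implied_prevalence(precision: float, recall: float, specificity: float) -> float:
    """Positive-class share consistent with the three rates.

    Obtained by solving precision = R*p / (R*p + (1 - S)*(1 - p)) for p.
    """
    false_positive_rate = 1.0 - specificity
    numerator = precision * false_positive_rate
    return numerator / (recall * (1.0 - precision) + numerator)


# Rounded values as quoted in the abstract: (precision, recall, specificity).
reported = {"train": (0.08, 0.81, 0.80), "test": (0.07, 0.72, 0.80)}

for split, (p, r, s) in reported.items():
    print(
        f"{split}: F2 ~ {f_beta(p, r):.3f}, "
        f"implied prevalence ~ {implied_prevalence(p, r, s):.3f}"
    )
```

With the test-set values, the formula gives an F2 of about 0.252, in line with the reported 0.25. The train-set values give about 0.287 against the reported 0.28, a gap that may simply reflect rounding of the published percentages. Under these inputs the implied share of long COVID cases is roughly 2%, which would explain why precision stays low even with 80% specificity.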