Stratification of diabetes in the context of comorbidities, using representation learning and topological data analysis
Diabetes is a heterogeneous, multimorbid disorder with a large variation in manifestations, trajectories, and outcomes. The aim of this study is to validate a novel machine learning method for the phenotyping of diabetes in the context of comorbidities. Data from 9967 multimorbid patients with a new...
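Main authors: | Wamil, Malgorzata; Hassaine, Abdelaali; Rao, Shishir; Li, Yikuan; Mamouei, Mohammad; Canoy, Dexter; Nazarzadeh, Milad; Bidel, Zeinab; Copland, Emma; Rahimi, Kazem; Salimi-Khorshidi, Gholamreza |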
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10350454/ https://www.ncbi.nlm.nih.gov/pubmed/37455284 http://dx.doi.org/10.1038/s41598-023-38251-1 |
_version_ | 1785074138951450624 |
---|---|
author | Wamil, Malgorzata Hassaine, Abdelaali Rao, Shishir Li, Yikuan Mamouei, Mohammad Canoy, Dexter Nazarzadeh, Milad Bidel, Zeinab Copland, Emma Rahimi, Kazem Salimi-Khorshidi, Gholamreza |
author_sort | Wamil, Malgorzata |
collection | PubMed |
description | Diabetes is a heterogeneous, multimorbid disorder with a large variation in manifestations, trajectories, and outcomes. The aim of this study is to validate a novel machine learning method for the phenotyping of diabetes in the context of comorbidities. Data from 9967 multimorbid patients with a new diagnosis of diabetes were extracted from the Clinical Practice Research Datalink. First, using BEHRT (a transformer-based deep learning architecture), the embeddings corresponding to diabetes were learned. Next, topological data analysis (TDA) was carried out to test how different areas in the high-dimensional manifold correspond to different risk profiles. The following endpoints were considered when profiling risk trajectories: major adverse cardiovascular events (MACE), coronary artery disease (CAD), stroke (CVA), heart failure (HF), renal failure (RF), diabetic neuropathy, peripheral arterial disease, reduced visual acuity, and all-cause mortality. Kaplan-Meier curves were plotted for each derived phenotype. Finally, we tested the performance of an established risk prediction model (QRISK) by adding TDA-derived features. We identified four subgroups of patients with diabetes and divergent comorbidity patterns differing in their risk of future cardiovascular, renal, and other microvascular outcomes. Phenotype 1 (young with chronic inflammatory conditions) and phenotype 2 (young with CAD) included relatively younger patients with diabetes compared to phenotypes 3 (older with hypertension and renal disease) and 4 (older with previous CVA), and those subgroups had a higher frequency of pre-existing cardio-renal diseases. Within ten years of follow-up, 2592 patients (26%) experienced MACE, 2515 patients (25%) died, and 2020 patients (20%) suffered RF. The QRISK3 model's AUC increased from 67.26% (CI 67.25–67.28%) to 67.67% (CI 67.66–67.69%) when the specific TDA-derived phenotype and the distances to both extremities of the TDA graph were added, improving its performance in the prediction of CV outcomes. We confirmed the importance of accounting for multimorbidity when risk-stratifying a heterogeneous cohort of patients with a new diagnosis of diabetes. Our unsupervised machine learning method improved the prediction of clinical outcomes. |
format | Online Article Text |
id | pubmed-10350454 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10350454 2023-07-18 Stratification of diabetes in the context of comorbidities, using representation learning and topological data analysis Wamil, Malgorzata Hassaine, Abdelaali Rao, Shishir Li, Yikuan Mamouei, Mohammad Canoy, Dexter Nazarzadeh, Milad Bidel, Zeinab Copland, Emma Rahimi, Kazem Salimi-Khorshidi, Gholamreza Sci Rep Article Nature Publishing Group UK 2023-07-16 /pmc/articles/PMC10350454/ /pubmed/37455284 http://dx.doi.org/10.1038/s41598-023-38251-1 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
title | Stratification of diabetes in the context of comorbidities, using representation learning and topological data analysis |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10350454/ https://www.ncbi.nlm.nih.gov/pubmed/37455284 http://dx.doi.org/10.1038/s41598-023-38251-1 |
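The description in this record states that Kaplan-Meier curves were plotted for each derived phenotype. As a rough illustration only (not the authors' code), the product-limit estimator behind such curves can be sketched in plain Python. The function names, the group labels, and the follow-up data below are hypothetical and synthetic; none of them come from the study.

```python
from collections import defaultdict


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per patient (for example, in years)
    events:    1 if the endpoint occurred at that time, 0 if censored
    """
    deaths = defaultdict(int)   # endpoint events per distinct time
    removed = defaultdict(int)  # events plus censorings per distinct time
    for t, e in zip(durations, events):
        removed[t] += 1
        if e:
            deaths[t] += 1

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(removed):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone observed at t (event or censored) leaves the risk set.
        at_risk -= removed[t]
    return curve


def curves_by_group(groups, durations, events):
    """Estimate one Kaplan-Meier curve per group label."""
    by_group = defaultdict(lambda: ([], []))
    for g, t, e in zip(groups, durations, events):
        by_group[g][0].append(t)
        by_group[g][1].append(e)
    return {g: kaplan_meier(ts, es) for g, (ts, es) in by_group.items()}


# Synthetic toy cohort (hypothetical labels and values, not study data).
groups = ["phenotype_1"] * 4 + ["phenotype_2"] * 3
durations = [1, 2, 2, 4, 3, 5, 5]
events = [1, 0, 1, 1, 0, 1, 0]

for label, curve in sorted(curves_by_group(groups, durations, events).items()):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

Censored patients lower the number at risk without producing a step in the curve, which is why the estimator tracks events and removals separately. For real analyses, an established survival-analysis library would normally be preferable to a hand-written estimator.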