
Stratification of diabetes in the context of comorbidities, using representation learning and topological data analysis

Diabetes is a heterogeneous, multimorbid disorder with a large variation in manifestations, trajectories, and outcomes. The aim of this study is to validate a novel machine learning method for the phenotyping of diabetes in the context of comorbidities. Data from 9967 multimorbid patients with a new...


Bibliographic Details
Main authors: Wamil, Malgorzata, Hassaine, Abdelaali, Rao, Shishir, Li, Yikuan, Mamouei, Mohammad, Canoy, Dexter, Nazarzadeh, Milad, Bidel, Zeinab, Copland, Emma, Rahimi, Kazem, Salimi-Khorshidi, Gholamreza
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10350454/
https://www.ncbi.nlm.nih.gov/pubmed/37455284
http://dx.doi.org/10.1038/s41598-023-38251-1
collection PubMed
description Diabetes is a heterogeneous, multimorbid disorder with a large variation in manifestations, trajectories, and outcomes. The aim of this study is to validate a novel machine learning method for the phenotyping of diabetes in the context of comorbidities. Data from 9967 multimorbid patients with a new diagnosis of diabetes were extracted from Clinical Practice Research Datalink. First, using BEHRT (a transformer-based deep learning architecture), the embeddings corresponding to diabetes were learned. Next, topological data analysis (TDA) was carried out to test how different areas in the high-dimensional manifold correspond to different risk profiles. The following endpoints were considered when profiling risk trajectories: major adverse cardiovascular events (MACE), coronary artery disease (CAD), stroke (CVA), heart failure (HF), renal failure (RF), diabetic neuropathy, peripheral arterial disease, reduced visual acuity, and all-cause mortality. Kaplan–Meier curves were plotted for each derived phenotype. Finally, we tested whether adding TDA-derived features improved the performance of an established risk prediction model (QRISK). We identified four subgroups of patients with diabetes and divergent comorbidity patterns differing in their risk of future cardiovascular, renal, and other microvascular outcomes. Phenotype 1 (young with chronic inflammatory conditions) and phenotype 2 (young with CAD) included relatively younger patients with diabetes compared to phenotypes 3 (older with hypertension and renal disease) and 4 (older with previous CVA), and those subgroups had a higher frequency of pre-existing cardio-renal diseases. Within ten years of follow-up, 2592 patients (26%) experienced MACE, 2515 patients (25%) died, and 2020 patients (20%) suffered RF. The AUC of the QRISK3 model increased from 67.26% (CI 67.25–67.28%) to 67.67% (CI 67.66–67.69%) when the specific TDA-derived phenotype and the distances to both extremities of the TDA graph were added, improving its performance in the prediction of CV outcomes. We confirmed the importance of accounting for multimorbidity when risk-stratifying a heterogeneous cohort of patients with a new diagnosis of diabetes. Our unsupervised machine learning method improved the prediction of clinical outcomes.
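The abstract mentions that Kaplan–Meier curves were plotted for each derived phenotype. As a rough illustration of what such a curve computes, the sketch below implements the standard product-limit estimator in plain Python. It is not the authors' code: the function name `kaplan_meier` and the toy follow-up data are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit estimate of survival at each distinct event time.

    durations: follow-up time for each patient (any consistent unit)
    events:    1 if the endpoint was observed, 0 if the patient was censored
    Returns a list of (time, survival_probability) pairs.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    # Number of observed events at each time point.
    observed = Counter(t for t, e in zip(durations, events) if e)
    # Every patient leaves the risk set at their own follow-up time.
    leaving = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Toy data (hypothetical, not from the study): years of follow-up and event flags.
toy_durations = [2, 3, 3, 5, 8, 10]
toy_events = [1, 1, 0, 1, 0, 0]
toy_curve = kaplan_meier(toy_durations, toy_events)

for time, prob in toy_curve:
    print(f"t={time}: S(t)={prob:.4f}")
```

In a per-phenotype analysis, a function like this would be called once per subgroup and the resulting step curves compared; censored patients reduce the risk set without lowering the survival estimate.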
id pubmed-10350454
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-10350454 2023-07-18 Sci Rep Article Nature Publishing Group UK 2023-07-16 /pmc/articles/PMC10350454/ /pubmed/37455284 http://dx.doi.org/10.1038/s41598-023-38251-1 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.