Data‐driven discovery of probable Alzheimer's disease and related dementia subphenotypes using electronic health records

Bibliographic Details
Main authors: Xu, Jie, Wang, Fei, Xu, Zhenxing, Adekkanattu, Prakash, Brandt, Pascal, Jiang, Guoqian, Kiefer, Richard C., Luo, Yuan, Mao, Chengsheng, Pacheco, Jennifer A., Rasmussen, Luke V., Zhang, Yiye, Isaacson, Richard, Pathak, Jyotishman
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2020
Subjects: Research Reports
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7556420/
https://www.ncbi.nlm.nih.gov/pubmed/33083543
http://dx.doi.org/10.1002/lrh2.10246
_version_ 1783594212367794176
author Xu, Jie
Wang, Fei
Xu, Zhenxing
Adekkanattu, Prakash
Brandt, Pascal
Jiang, Guoqian
Kiefer, Richard C.
Luo, Yuan
Mao, Chengsheng
Pacheco, Jennifer A.
Rasmussen, Luke V.
Zhang, Yiye
Isaacson, Richard
Pathak, Jyotishman
author_sort Xu, Jie
collection PubMed
description INTRODUCTION: We sought to assess longitudinal electronic health records (EHRs) using machine learning (ML) methods to computationally derive probable Alzheimer's Disease (AD) and related dementia subphenotypes.
METHODS: A retrospective analysis of EHR data from a cohort of 7587 patients seen at a large, multi‐specialty urban academic medical center in New York was conducted. Subphenotypes were derived by hierarchical clustering of clinical data from 792 probable AD patients (cases) who had received at least one diagnosis of AD. The other 6795 patients, labeled as controls, were matched on age and gender with the cases and randomly selected in the ratio of 9:1. Prediction models with multiple ML algorithms were trained on this cohort using 5‐fold cross‐validation. XGBoost was used to rank the variable importance.
RESULTS: Four subphenotypes were computationally derived. Subphenotype A (n = 273; 28.2%) had more patients with cardiovascular diseases; subphenotype B (n = 221; 27.9%) had more patients with mental health illnesses, such as depression and anxiety; patients in subphenotype C (n = 183; 23.1%) were overall older (mean (SD) age, 79.5 (5.4) years) and had the most comorbidities, including diabetes, cardiovascular diseases, and mental health disorders; and subphenotype D (n = 115; 14.5%) included patients who took anti‐dementia drugs and had sensory problems, such as deafness and hearing impairment. The 0‐year prediction model for AD risk achieved an area under the receiver operating characteristic curve (AUC) of 0.764 (SD: 0.02); the 6‐month model, 0.751 (SD: 0.02); the 1‐year model, 0.752 (SD: 0.02); the 2‐year model, 0.749 (SD: 0.03); and the 3‐year model, 0.735 (SD: 0.03). Based on variable importance, the top‐ranked comorbidities included depression, stroke/transient ischemic attack, hypertension, anxiety, mobility impairments, and atrial fibrillation. The top‐ranked medications included anti‐dementia drugs, antipsychotics, antiepileptics, and antidepressants.
CONCLUSIONS: Four subphenotypes were computationally derived that correlated with cardiovascular diseases and mental health illnesses. ML algorithms based on patient demographics, diagnosis, and treatment demonstrated promising results in predicting the risk of developing AD at different time points across an individual's lifespan.
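The cohort figures in the description can be cross-checked with simple arithmetic. The sketch below is only an illustration, not part of the record or the authors' analysis: it takes the counts printed in the abstract and recomputes each subphenotype's share of the 792 cases and the number of controls per case. The variable names are hypothetical. Under these counts the recomputed shares for B, C, and D match the printed percentages, while the share for A comes out at 34.5% rather than the 28.2% shown in the record.

```python
# Counts copied from the abstract; all variable names are illustrative assumptions.
cases = 792
controls = 6795
subphenotypes = {"A": 273, "B": 221, "C": 183, "D": 115}

# The four subphenotypes partition the cases, and cases plus controls
# make up the full cohort of 7587 patients.
assert sum(subphenotypes.values()) == cases
assert cases + controls == 7587

# Share of each subphenotype among the cases, in percent, to one decimal place.
shares = {name: round(100 * n / cases, 1) for name, n in subphenotypes.items()}

# Observed number of controls per case (the abstract states a nominal 9:1 ratio).
control_ratio = controls / cases

for name, pct in shares.items():
    print(f"Subphenotype {name}: {subphenotypes[name]} of {cases} cases = {pct}%")
print(f"Controls per case: {control_ratio:.2f}")
```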
format Online
Article
Text
id pubmed-7556420
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-7556420 2020-10-19 Learn Health Syst Research Reports John Wiley and Sons Inc. 2020-09-10 /pmc/articles/PMC7556420/ /pubmed/33083543 http://dx.doi.org/10.1002/lrh2.10246 Text en © 2020 The Authors. Learning Health Systems published by Wiley Periodicals LLC on behalf of the University of Michigan. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc/4.0/ License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title Data‐driven discovery of probable Alzheimer's disease and related dementia subphenotypes using electronic health records
topic Research Reports
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7556420/
https://www.ncbi.nlm.nih.gov/pubmed/33083543
http://dx.doi.org/10.1002/lrh2.10246