
Predicting hypertension onset from longitudinal electronic health records with deep learning

OBJECTIVE: Hypertension has long been recognized as one of the most important predisposing factors for cardiovascular diseases and mortality. In recent years, machine learning methods have shown potential in diagnostic and predictive approaches in chronic diseases. Electronic health records (EHRs) h...


Bibliographic Details
Main authors: Datta, Suparno, Morassi Sasso, Ariane, Kiwit, Nina, Bose, Subhronil, Nadkarni, Girish, Miotto, Riccardo, Böttinger, Erwin P
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9696747/
https://www.ncbi.nlm.nih.gov/pubmed/36448021
http://dx.doi.org/10.1093/jamiaopen/ooac097
author Datta, Suparno
Morassi Sasso, Ariane
Kiwit, Nina
Bose, Subhronil
Nadkarni, Girish
Miotto, Riccardo
Böttinger, Erwin P
collection PubMed
description OBJECTIVE: Hypertension has long been recognized as one of the most important predisposing factors for cardiovascular diseases and mortality. In recent years, machine learning methods have shown potential in diagnostic and predictive approaches in chronic diseases. Electronic health records (EHRs) have emerged as a reliable source of longitudinal data. The aim of this study is to predict the onset of hypertension using modern deep learning (DL) architectures, specifically long short-term memory (LSTM) networks, and longitudinal EHRs. MATERIALS AND METHODS: We compare this approach to the best performing models reported from previous works, particularly XGBoost, applied to aggregated features. Our work is based on data from 233 895 adult patients from a large health system in the United States. We divided our population into 2 distinct longitudinal datasets based on the diagnosis date. To ensure generalization to unseen data, we trained our models on the first dataset (dataset A “train and validation”) using cross-validation, and then applied the models to a second dataset (dataset B “test”) to assess their performance. We also experimented with 2 different time windows before the onset of hypertension and evaluated the impact on model performance. RESULTS: With the LSTM network, we were able to achieve an area under the receiver operating characteristic curve value of 0.98 in the “train and validation” dataset A and 0.94 in the “test” dataset B for a prediction time window of 1 year. Lipid disorders, type 2 diabetes, and renal disorders are found to be associated with incident hypertension. CONCLUSION: These findings show that DL models based on temporal EHR data can improve the identification of patients at high risk of hypertension and corresponding driving factors. In the long term, this work may support identifying individuals who are at high risk for developing hypertension and facilitate earlier intervention to prevent the future development of hypertension.
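The abstract reports performance as the area under the receiver operating characteristic curve (AUROC). As a rough illustration of what that number measures, the sketch below computes AUROC from labels and predicted scores using the rank-sum (Mann-Whitney U) identity. The function name `auroc` and the toy data are assumptions made for this example; this is not the authors' evaluation code, and the record does not say which library they used.

```python
def auroc(labels, scores):
    """AUROC via the rank-sum identity (hypothetical helper, not from the paper).

    labels: iterable of 0/1 outcomes (1 = developed hypertension).
    scores: iterable of predicted risks, higher meaning more likely positive.
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied scores share the average of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j

    # U statistic for the positive class, normalized by the number of
    # positive/negative pairs.
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy data, invented for illustration only.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(example_labels, example_scores))
```

Read this way, an AUROC of 0.94 means that a randomly chosen patient who went on to develop hypertension receives a higher predicted score than a randomly chosen patient who did not in about 94% of such pairs.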
format Online
Article
Text
id pubmed-9696747
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9696747 2022-11-28 JAMIA Open Research and Applications Oxford University Press 2022-11-25 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Predicting hypertension onset from longitudinal electronic health records with deep learning
topic Research and Applications