Predicting cardiovascular risk from national administrative databases using a combined survival analysis and deep learning approach

Bibliographic Details
Main authors: Barbieri, Sebastiano; Mehta, Suneela; Wu, Billy; Bharat, Chrianna; Poppe, Katrina; Jorm, Louisa; Jackson, Rod
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Methods
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9189958/
https://www.ncbi.nlm.nih.gov/pubmed/34910160
http://dx.doi.org/10.1093/ije/dyab258
author Barbieri, Sebastiano
Mehta, Suneela
Wu, Billy
Bharat, Chrianna
Poppe, Katrina
Jorm, Louisa
Jackson, Rod
collection PubMed
description BACKGROUND: Machine learning-based risk prediction models may outperform traditional statistical models in large datasets with many variables, by identifying both novel predictors and the complex interactions between them. This study compared deep learning extensions of survival analysis models with Cox proportional hazards models for predicting cardiovascular disease (CVD) risk in national health administrative datasets.
METHODS: Using individual person linkage of administrative datasets, we constructed a cohort of all New Zealanders aged 30–74 who interacted with public health services during 2012. After excluding people with prior CVD, we developed sex-specific deep learning and Cox proportional hazards models to estimate the risk of CVD events within 5 years. Models were compared based on the proportion of explained variance, model calibration and discrimination, and hazard ratios for predictor variables.
RESULTS: First CVD events occurred in 61 927 of 2 164 872 people. Within the reference group, the largest hazard ratios estimated by the deep learning models were for tobacco use in women (2.04, 95% CI: 1.99, 2.10) and chronic obstructive pulmonary disease with acute lower respiratory infection in men (1.56, 95% CI: 1.50, 1.62). Other identified predictors (e.g. hypertension, chest pain, diabetes) aligned with current knowledge about CVD risk factors. Deep learning outperformed Cox proportional hazards models on the basis of proportion of explained variance (R²: 0.468 vs 0.425 in women and 0.383 vs 0.348 in men), calibration and discrimination (all P < 0.0001).
CONCLUSIONS: Deep learning extensions of survival analysis models can be applied to large health administrative datasets to derive interpretable CVD risk prediction equations that are more accurate than traditional Cox proportional hazards models.
format Online
Article
Text
id pubmed-9189958
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9189958 2022-06-14 Int J Epidemiol Methods Oxford University Press 2021-12-15 /pmc/articles/PMC9189958/ /pubmed/34910160 http://dx.doi.org/10.1093/ije/dyab258 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the International Epidemiological Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Predicting cardiovascular risk from national administrative databases using a combined survival analysis and deep learning approach
topic Methods