
Consistency of variety of machine learning and statistical models in predicting clinical risks of individual patients: longitudinal cohort study using cardiovascular disease as exemplar

OBJECTIVE: To assess the consistency of machine learning and statistical techniques in predicting individual level and population level risks of cardiovascular disease and the effects of censoring on risk predictions. DESIGN: Longitudinal cohort study from 1 January 1998 to 31 December 2018. SETTING...


Bibliographic Details
Main authors: Li, Yan; Sperrin, Matthew; Ashcroft, Darren M; van Staa, Tjeerd Pieter
Format: Online Article Text
Language: English
Published: BMJ Publishing Group Ltd. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7610202/
https://www.ncbi.nlm.nih.gov/pubmed/33148619
http://dx.doi.org/10.1136/bmj.m3919
_version_ 1783605153866186752
author Li, Yan
Sperrin, Matthew
Ashcroft, Darren M
van Staa, Tjeerd Pieter
author_sort Li, Yan
collection PubMed
description OBJECTIVE: To assess the consistency of machine learning and statistical techniques in predicting individual level and population level risks of cardiovascular disease and the effects of censoring on risk predictions.
DESIGN: Longitudinal cohort study from 1 January 1998 to 31 December 2018.
SETTING AND PARTICIPANTS: 3.6 million patients from the Clinical Practice Research Datalink registered at 391 general practices in England with linked hospital admission and mortality records.
MAIN OUTCOME MEASURES: Model performance including discrimination, calibration, and consistency of individual risk prediction for the same patients among models with comparable model performance. 19 different prediction techniques were applied, including 12 families of machine learning models (grid searched for best models), three Cox proportional hazards models (local fitted, QRISK3, and Framingham), three parametric survival models, and one logistic model.
RESULTS: The various models had similar population level performance (C statistics of about 0.87 and similar calibration). However, the predictions for individual risks of cardiovascular disease varied widely between and within different types of machine learning and statistical models, especially in patients with higher risks. A patient with a risk of 9.5-10.5% predicted by QRISK3 had a risk of 2.9-9.2% in a random forest and 2.4-7.2% in a neural network. The differences in predicted risks between QRISK3 and a neural network ranged between –23.2% and 0.1% (95% range). Models that ignored censoring (that is, assumed censored patients to be event free) substantially underestimated risk of cardiovascular disease. Of the 223 815 patients with a cardiovascular disease risk above 7.5% with QRISK3, 57.8% would be reclassified below 7.5% when using another model.
CONCLUSIONS: A variety of models predicted risks for the same patients very differently despite similar model performances. The logistic models and commonly used machine learning models should not be directly applied to the prediction of long term risks without considering censoring. Survival models that consider censoring and that are explainable, such as QRISK3, are preferable. The level of consistency within and between models should be routinely assessed before they are used for clinical decision making.
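The point about censoring in the abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis: the cohort is synthetic, the 10% ten-year event risk and the uniform censoring pattern are assumptions chosen for illustration, and `naive_risk` and `kaplan_meier_risk` are hypothetical helper names. It compares a naive estimate that counts censored patients as event free with a Kaplan-Meier estimate that accounts for censoring.

```python
import math
import random


def naive_risk(times, events, horizon):
    """Share of patients with an observed event by `horizon`.

    Censored patients are counted as event free, which is the
    assumption the abstract warns against.
    """
    observed = sum(1 for t, e in zip(times, events) if e and t <= horizon)
    return observed / len(times)


def kaplan_meier_risk(times, events, horizon):
    """Return 1 - S(horizon) from the Kaplan-Meier product-limit estimator."""
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    i = 0
    while i < len(order) and order[i][0] <= horizon:
        t = order[i][0]
        n_events = 0
        n_leaving = 0
        # Group all records sharing this time point.
        while i < len(order) and order[i][0] == t:
            n_events += order[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
        # Both events and censored records leave the risk set after t.
        at_risk -= n_leaving
    return 1.0 - survival


# Synthetic cohort (assumption): exponential event times with a true
# 10-year risk of 10%, and follow-up censored uniformly over 0-20 years.
rng = random.Random(0)
true_risk = 0.10
rate = -math.log(1.0 - true_risk) / 10.0

times, events = [], []
for _ in range(20000):
    event_time = rng.expovariate(rate)
    censor_time = rng.uniform(0.0, 20.0)
    times.append(min(event_time, censor_time))
    events.append(1 if event_time <= censor_time else 0)

naive = naive_risk(times, events, horizon=10.0)
km = kaplan_meier_risk(times, events, horizon=10.0)
print(f"true 10-year risk:      {true_risk:.3f}")
print(f"naive (ignores censor): {naive:.3f}")
print(f"Kaplan-Meier:           {km:.3f}")
```

In this synthetic setting the naive estimate falls below the true risk because patients who leave follow-up early are treated as if they had been observed event free for the full ten years, while the Kaplan-Meier estimate stays close to the true value. The size of the gap depends entirely on the assumed censoring pattern.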
format Online
Article
Text
id pubmed-7610202
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BMJ Publishing Group Ltd.
record_format MEDLINE/PubMed
spelling pubmed-7610202 2020-11-12 BMJ Research BMJ Publishing Group Ltd. 2020-11-04 /pmc/articles/PMC7610202/ /pubmed/33148619 http://dx.doi.org/10.1136/bmj.m3919 Text en © Author(s) (or their employer(s)) 2019. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/.
title Consistency of variety of machine learning and statistical models in predicting clinical risks of individual patients: longitudinal cohort study using cardiovascular disease as exemplar
topic Research