
Using electronic health records to develop and validate a machine-learning tool to predict type 2 diabetes outcomes: a study protocol

INTRODUCTION: Type 2 diabetes mellitus (T2DM) is a major cause of blindness, kidney failure, myocardial infarction, stroke and lower limb amputation. We are still unable, however, to accurately predict or identify which patients are at a higher risk of deterioration. Most risk stratification tools d...


Bibliographic Details
Main Authors: Neves, Ana Luisa, Pereira Rodrigues, Pedro, Mulla, Abdulrahim, Glampson, Ben, Willis, Tony, Darzi, Ara, Mayer, Erik
Format: Online Article Text
Language: English
Published: BMJ Publishing Group 2021
Subjects: Health Informatics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8327849/
https://www.ncbi.nlm.nih.gov/pubmed/34330856
http://dx.doi.org/10.1136/bmjopen-2020-046716
collection PubMed
description INTRODUCTION: Type 2 diabetes mellitus (T2DM) is a major cause of blindness, kidney failure, myocardial infarction, stroke and lower limb amputation. We are still unable, however, to accurately predict or identify which patients are at a higher risk of deterioration. Most risk stratification tools do not account for novel factors such as sociodemographic determinants, self-management ability or access to healthcare. Additionally, most tools are based on clinical trials, with limited external generalisability.
OBJECTIVE: The aim of this work is to design and validate a machine learning-based tool to identify patients with T2DM at high risk of clinical deterioration, based on a comprehensive set of patient-level characteristics retrieved from a population health linked dataset.
SAMPLE AND DESIGN: Retrospective cohort study of patients with a diagnosis of T2DM on 1 January 2015, with a 5-year follow-up. Anonymised electronic healthcare records from the Whole System Integrated Care (WSIC) database will be used.
PRELIMINARY OUTCOMES: Outcome variables of clinical deterioration will include retinopathy, chronic renal disease, myocardial infarction, stroke, peripheral arterial disease or death. Predictor variables will include sociodemographic and geographic data, patients’ ability to self-manage disease, clinical and metabolic parameters and healthcare service usage. Prognostic models will be defined using multidependence Bayesian networks. The derivation cohort, comprising 80% of the patients, will be used to define the prognostic models. Model parameters will be internally validated by comparing the area under the receiver operating characteristic curve in the derivation cohort with those calculated from a leave-one-out and a 10 times twofold cross-validation.
ETHICS AND DISSEMINATION: The study has received approvals from the Information Governance Committee at the WSIC. Results will be made available to people with T2DM, their caregivers, the funders, diabetes care societies and other researchers.
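The validation scheme described in the abstract (an 80% derivation cohort, with the apparent area under the ROC curve compared against a 10 times twofold cross-validation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the data are synthetic, a Bernoulli naive Bayes classifier stands in for the multidependence Bayesian networks named in the protocol, leave-one-out is omitted for brevity, and all function names are hypothetical rather than taken from the study.

```python
import math
import random
from statistics import mean


def auc(scores, labels):
    """AUC via the Mann-Whitney statistic; tied scores count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_naive_bayes(X, y):
    """Bernoulli naive Bayes with Laplace smoothing (stand-in model)."""
    params = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X, y) if t == c]
        prior = (len(rows) + 1) / (len(X) + 2)
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(len(X[0]))]
        params[c] = (prior, probs)
    return params


def predict_score(params, x):
    """Log-odds of the outcome (class 1) versus no outcome (class 0)."""
    def loglik(c):
        prior, probs = params[c]
        return math.log(prior) + sum(
            math.log(p if v else 1 - p) for v, p in zip(x, probs))
    return loglik(1) - loglik(0)


def repeated_twofold_auc(X, y, repeats=10, seed=0):
    """Repeat a random 50/50 split; each half serves once as the test set."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    aucs = []
    for _ in range(repeats):
        rng.shuffle(idx)
        half = len(idx) // 2
        folds = (idx[:half], idx[half:])
        for test, train in (folds, folds[::-1]):
            model = fit_naive_bayes([X[i] for i in train],
                                    [y[i] for i in train])
            scores = [predict_score(model, X[i]) for i in test]
            aucs.append(auc(scores, [y[i] for i in test]))
    return aucs


def make_synthetic(n=400, seed=1):
    """Synthetic cohort: feature 0 is informative, feature 1 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = int(rng.random() < 0.3)
        x0 = int(rng.random() < (0.8 if label else 0.2))
        x1 = int(rng.random() < 0.5)
        X.append([x0, x1])
        y.append(label)
    return X, y


X, y = make_synthetic()
order = list(range(len(X)))
random.Random(2).shuffle(order)
cut = int(0.8 * len(order))  # 80% derivation cohort
X_dev = [X[i] for i in order[:cut]]
y_dev = [y[i] for i in order[:cut]]

model = fit_naive_bayes(X_dev, y_dev)
apparent_auc = auc([predict_score(model, x) for x in X_dev], y_dev)
cv_aucs = repeated_twofold_auc(X_dev, y_dev)
print(f"apparent AUC {apparent_auc:.3f}, mean 10x2-fold AUC {mean(cv_aucs):.3f}")
```

A large gap between the apparent AUC and the cross-validated mean would suggest overfitting in the derivation cohort; the remaining 20% of patients are left untouched here.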
format Online
Article
Text
id pubmed-8327849
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BMJ Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-8327849 2021-08-19 BMJ Open Health Informatics BMJ Publishing Group 2021-07-30 /pmc/articles/PMC8327849/ /pubmed/34330856 http://dx.doi.org/10.1136/bmjopen-2020-046716 Text en
© Author(s) (or their employer(s)) 2021. Re-use permitted under CC BY. Published by BMJ. https://creativecommons.org/licenses/by/4.0/
This is an open access article distributed in accordance with the Creative Commons Attribution 4.0 Unported (CC BY 4.0) license, which permits others to copy, redistribute, remix, transform and build upon this work for any purpose, provided the original work is properly cited, a link to the licence is given, and indication of whether changes were made. See: https://creativecommons.org/licenses/by/4.0/.
topic Health Informatics