
Identifying undetected dementia in UK primary care patients: a retrospective case-control study comparing machine-learning and standard epidemiological approaches

BACKGROUND: Identifying dementia early in time, using real world data, is a public health challenge. As only two-thirds of people with dementia now ultimately receive a formal diagnosis in United Kingdom health systems and many receive it late in the disease process, there is ample room for improvem...


Bibliographic Details
Main authors: Ford, Elizabeth; Rooney, Philip; Oliver, Seb; Hoile, Richard; Hurley, Peter; Banerjee, Sube; van Marwijk, Harm; Cassell, Jackie
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6889642/
https://www.ncbi.nlm.nih.gov/pubmed/31791325
http://dx.doi.org/10.1186/s12911-019-0991-9
author Ford, Elizabeth
Rooney, Philip
Oliver, Seb
Hoile, Richard
Hurley, Peter
Banerjee, Sube
van Marwijk, Harm
Cassell, Jackie
collection PubMed
description BACKGROUND: Identifying dementia early in time, using real world data, is a public health challenge. As only two-thirds of people with dementia now ultimately receive a formal diagnosis in United Kingdom health systems and many receive it late in the disease process, there is ample room for improvement. The policy of the UK government and National Health Service (NHS) is to increase rates of timely dementia diagnosis. We used data from general practice (GP) patient records to create a machine-learning model to identify patients who have or who are developing dementia, but are currently undetected as having the condition by the GP. METHODS: We used electronic patient records from Clinical Practice Research Datalink (CPRD). Using a case-control design, we selected patients aged >65y with a diagnosis of dementia (cases) and matched them 1:1 by sex and age to patients with no evidence of dementia (controls). We developed a list of 70 clinical entities related to the onset of dementia and recorded in the 5 years before diagnosis. After creating binary features, we trialled machine learning classifiers to discriminate between cases and controls (logistic regression, naïve Bayes, support vector machines, random forest and neural networks). We examined the most important features contributing to discrimination. RESULTS: The final analysis included data on 93,120 patients, with a median age of 82.6 years; 64.8% were female. The naïve Bayes model performed least well. The logistic regression, support vector machine, neural network and random forest performed very similarly with an AUROC of 0.74. The top features retained in the logistic regression model were disorientation and wandering, behaviour change, schizophrenia, self-neglect, and difficulty managing. CONCLUSIONS: Our model could aid GPs or health service planners with the early detection of dementia. 
Future work could improve the model by exploring the longitudinal nature of patient data and modelling decline in function over time.
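The abstract reports discrimination between cases and controls as an AUROC of 0.74. As a minimal sketch of what that metric measures, the Python below computes AUROC directly from its definition: the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as half. Everything in it is an illustrative assumption rather than the authors' method: the three feature names are borrowed from the abstract only as labels, the six toy patients are invented, and the score is a simple count of recorded binary features, not a fitted classifier.

```python
# Hypothetical binary features (1 = recorded in the 5 years before the index date).
FEATURES = ("disorientation", "behaviour_change", "self_neglect")

# Toy, invented records: (is_case, feature flags in FEATURES order).
PATIENTS = [
    (1, (1, 1, 0)),
    (1, (1, 0, 0)),
    (1, (0, 0, 0)),
    (0, (0, 1, 0)),
    (0, (0, 0, 0)),
    (0, (0, 0, 0)),
]


def feature_count_score(flags):
    """Illustrative score only: the number of binary features recorded."""
    return sum(flags)


def auroc(labels, scores):
    """AUROC as P(case score > control score), counting ties as 0.5."""
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


labels = [is_case for is_case, _ in PATIENTS]
scores = [feature_count_score(flags) for _, flags in PATIENTS]
toy_auc = auroc(labels, scores)
print(f"toy AUROC: {toy_auc:.3f}")
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of patients, so a rank-based formulation would be the usual choice for a data set the size of the one described in the abstract.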
format Online
Article
Text
id pubmed-6889642
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6889642 2019-12-11 Identifying undetected dementia in UK primary care patients: a retrospective case-control study comparing machine-learning and standard epidemiological approaches Ford, Elizabeth Rooney, Philip Oliver, Seb Hoile, Richard Hurley, Peter Banerjee, Sube van Marwijk, Harm Cassell, Jackie BMC Med Inform Decis Mak Research Article BACKGROUND: Identifying dementia early in time, using real world data, is a public health challenge. As only two-thirds of people with dementia now ultimately receive a formal diagnosis in United Kingdom health systems and many receive it late in the disease process, there is ample room for improvement. The policy of the UK government and National Health Service (NHS) is to increase rates of timely dementia diagnosis. We used data from general practice (GP) patient records to create a machine-learning model to identify patients who have or who are developing dementia, but are currently undetected as having the condition by the GP. METHODS: We used electronic patient records from Clinical Practice Research Datalink (CPRD). Using a case-control design, we selected patients aged >65y with a diagnosis of dementia (cases) and matched them 1:1 by sex and age to patients with no evidence of dementia (controls). We developed a list of 70 clinical entities related to the onset of dementia and recorded in the 5 years before diagnosis. After creating binary features, we trialled machine learning classifiers to discriminate between cases and controls (logistic regression, naïve Bayes, support vector machines, random forest and neural networks). We examined the most important features contributing to discrimination. RESULTS: The final analysis included data on 93,120 patients, with a median age of 82.6 years; 64.8% were female. The naïve Bayes model performed least well. The logistic regression, support vector machine, neural network and random forest performed very similarly with an AUROC of 0.74. The top features retained in the logistic regression model were disorientation and wandering, behaviour change, schizophrenia, self-neglect, and difficulty managing. CONCLUSIONS: Our model could aid GPs or health service planners with the early detection of dementia. Future work could improve the model by exploring the longitudinal nature of patient data and modelling decline in function over time. BioMed Central 2019-12-02 /pmc/articles/PMC6889642/ /pubmed/31791325 http://dx.doi.org/10.1186/s12911-019-0991-9 Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Identifying undetected dementia in UK primary care patients: a retrospective case-control study comparing machine-learning and standard epidemiological approaches
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6889642/
https://www.ncbi.nlm.nih.gov/pubmed/31791325
http://dx.doi.org/10.1186/s12911-019-0991-9