
Predicting Onset of Dementia Using Clinical Notes and Machine Learning: Case-Control Study

BACKGROUND: Clinical trials need efficient tools to assist in recruiting patients at risk of Alzheimer disease and related dementias (ADRD). Early detection can also assist patients with financial planning for long-term care. Clinical notes are an important, underutilized source of information in ma...


Bibliographic Details
Main Authors: Hane, Christopher A; Nori, Vijay S; Crown, William H; Sanghavi, Darshak M; Bleicher, Paul
Format: Online Article Text
Language: English
Published: JMIR Publications 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7301255/
https://www.ncbi.nlm.nih.gov/pubmed/32490841
http://dx.doi.org/10.2196/17819
author Hane, Christopher A
Nori, Vijay S
Crown, William H
Sanghavi, Darshak M
Bleicher, Paul
collection PubMed
description BACKGROUND: Clinical trials need efficient tools to assist in recruiting patients at risk of Alzheimer disease and related dementias (ADRD). Early detection can also assist patients with financial planning for long-term care. Clinical notes are an important, underutilized source of information in machine learning models because of the cost of collection and complexity of analysis. OBJECTIVE: This study aimed to investigate the use of deidentified clinical notes from multiple hospital systems collected over 10 years to augment retrospective machine learning models of the risk of developing ADRD. METHODS: We used 2 years of data to predict the future outcome of ADRD onset. Clinical notes are provided in a deidentified format with specific terms and sentiments. Terms in clinical notes are embedded into a 100-dimensional vector space to identify clusters of related terms and abbreviations that differ across hospital systems and individual clinicians. RESULTS: When using clinical notes, the area under the curve (AUC) improved from 0.85 to 0.94, and positive predictive value (PPV) increased from 45.07% (25,245/56,018) to 68.32% (14,153/20,717) in the model at disease onset. Models with clinical notes improved in both AUC and PPV in years 3-6 when notes’ volume was largest; results are mixed in years 7 and 8 with the smallest cohorts. CONCLUSIONS: Although clinical notes helped in the short term, the presence of ADRD symptomatic terms years earlier than onset adds evidence to other studies that clinicians undercode diagnoses of ADRD. De-identified clinical notes increase the accuracy of risk models. Clinical notes collected across multiple hospital systems via natural language processing can be merged using postprocessing techniques to aid model accuracy.
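The PPV percentages reported in the RESULTS can be checked directly from the counts given in parentheses. The sketch below is only an arithmetic check; it assumes each fraction is (true positives)/(all patients flagged by the model), which is the standard definition of PPV, and the function and variable names are illustrative rather than taken from the article.

```python
def ppv_percent(true_positives: int, predicted_positives: int) -> float:
    """Positive predictive value, as a percentage of flagged patients."""
    if predicted_positives <= 0:
        raise ValueError("predicted_positives must be positive")
    return 100.0 * true_positives / predicted_positives


# Counts copied from the abstract (model at disease onset).
# Assumption: numerator = true positives, denominator = predicted positives.
baseline = ppv_percent(25_245, 56_018)    # model without clinical notes
with_notes = ppv_percent(14_153, 20_717)  # model with clinical notes

print(f"PPV without notes: {baseline:.2f}%")
print(f"PPV with notes:    {with_notes:.2f}%")
```

Under that reading, the model with clinical notes flags fewer patients (20,717 versus 56,018), and a larger share of those flagged go on to develop ADRD.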
format Online
Article
Text
id pubmed-7301255
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-7301255 2020-08-12 Predicting Onset of Dementia Using Clinical Notes and Machine Learning: Case-Control Study Hane, Christopher A Nori, Vijay S Crown, William H Sanghavi, Darshak M Bleicher, Paul JMIR Med Inform Original Paper JMIR Publications 2020-06-03 /pmc/articles/PMC7301255/ /pubmed/32490841 http://dx.doi.org/10.2196/17819 Text en
©Christopher A Hane, Vijay S Nori, William H Crown, Darshak M Sanghavi, Paul Bleicher. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 03.06.2020. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title Predicting Onset of Dementia Using Clinical Notes and Machine Learning: Case-Control Study
topic Original Paper