
Identifying probable dementia in undiagnosed Black and White Americans using machine learning in Veterans Health Administration electronic health records

The application of machine learning (ML) tools in electronic health records (EHRs) can help reduce the underdiagnosis of dementia, but models that are not designed to reflect minority populations may perpetuate that underdiagnosis. To address the underdiagnosis of dementia in both Black Americans (BA...


Bibliographic Details
Main Authors: Shao, Yijun, Todd, Kaitlin, Shutes-David, Andrew, Millard, Steven P., Brown, Karl, Thomas, Amy, Chen, Kathryn, Wilson, Katherine, Zeng, Qing T., Tsuang, Debby W.
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9934793/
https://www.ncbi.nlm.nih.gov/pubmed/36798376
http://dx.doi.org/10.1101/2023.02.08.23285540
author Shao, Yijun
Todd, Kaitlin
Shutes-David, Andrew
Millard, Steven P.
Brown, Karl
Thomas, Amy
Chen, Kathryn
Wilson, Katherine
Zeng, Qing T.
Tsuang, Debby W.
author_sort Shao, Yijun
collection PubMed
description The application of machine learning (ML) tools in electronic health records (EHRs) can help reduce the underdiagnosis of dementia, but models that are not designed to reflect minority populations may perpetuate that underdiagnosis. To address the underdiagnosis of dementia in both Black Americans (BAs) and white Americans (WAs), we sought to develop and validate ML models that assign race-specific risk scores. These scores were used to identify undiagnosed dementia in BA and WA Veterans in EHRs. More specifically, risk scores were generated separately for BAs (n=10K) and WAs (n=10K) in training samples of cases and controls by applying ML, equivalence mapping, topic modeling, and a support vector machine (SVM) to structured and unstructured EHR data. Scores were validated via blinded manual chart reviews (n=1.2K) of controls from a separate sample (n=20K). AUCs and negative and positive predictive values (NPVs and PPVs) were calculated to evaluate the models. There was a strong positive relationship between SVM-generated risk scores and undiagnosed dementia. BAs were more likely than WAs to have undiagnosed dementia per chart review, both overall (15.3% vs 9.5%) and among Veterans with scores above the 90th percentile cutoff (25.6% vs 15.3%). With chart reviews as the reference standard and varied cutoff scores, the BA model performed slightly better than the WA model (AUC=0.86 with NPV=0.98 and PPV=0.26 at the >90th percentile cutoff vs AUC=0.77 with NPV=0.98 and PPV=0.15 at the >90th percentile cutoff). The AUCs, NPVs, and PPVs suggest that race-specific ML models can assist in the identification of undiagnosed dementia, particularly in BAs. Future studies should investigate implementing EHR-based risk scores in clinics that serve both BA and WA Veterans.
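For readers unfamiliar with the metrics in the abstract, the following is a minimal sketch of how PPV and NPV at a percentile cutoff could be computed from risk scores and chart-review labels. It is not the authors' code: the function name `ppv_npv_at_percentile`, the nearest-rank percentile rule, and the toy data are all assumptions made for illustration. By definition, PPV is the share of flagged patients (scores above the cutoff) who truly have the condition, which is consistent with the abstract reporting both 25.6% undiagnosed dementia above the 90th percentile and PPV=0.26 for the BA model.

```python
import math
from typing import Sequence, Tuple


def ppv_npv_at_percentile(
    scores: Sequence[float],
    labels: Sequence[int],
    percentile: float = 90.0,
) -> Tuple[float, float, float]:
    """Return (cutoff, PPV, NPV) when flagging scores strictly above a percentile.

    Hypothetical helper for illustration only. `labels` holds 1 for
    reviewer-confirmed dementia and 0 otherwise. The cutoff uses the
    nearest-rank percentile definition, which is an assumption here.
    """
    if len(scores) != len(labels) or len(scores) == 0:
        raise ValueError("scores and labels must be non-empty and equal in length")

    ranked = sorted(scores)
    # Nearest-rank percentile: the k-th smallest value, k = ceil(p * n / 100).
    k = max(1, math.ceil(percentile * len(ranked) / 100))
    cutoff = ranked[k - 1]

    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        flagged = score > cutoff
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif not flagged and label == 0:
            tn += 1
        else:
            fn += 1

    # PPV: confirmed cases among flagged; NPV: non-cases among unflagged.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return cutoff, ppv, npv


# Toy data (made up, not from the study): ten patients, two confirmed cases.
toy_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_labels = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
cutoff, ppv, npv = ppv_npv_at_percentile(toy_scores, toy_labels, percentile=80.0)
print(cutoff, ppv, npv)
```

In the toy data, two patients score above the 80th percentile cutoff and one of them is a confirmed case, so PPV is 0.5; seven of the eight unflagged patients are non-cases, so NPV is 0.875.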
format Online
Article
Text
id pubmed-9934793
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-9934793 2023-02-17 medRxiv Article Cold Spring Harbor Laboratory 2023-02-14 /pmc/articles/PMC9934793/ /pubmed/36798376 http://dx.doi.org/10.1101/2023.02.08.23285540 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title Identifying probable dementia in undiagnosed Black and White Americans using machine learning in Veterans Health Administration electronic health records
topic Article