Identifying probable dementia in undiagnosed Black and White Americans using machine learning in Veterans Health Administration electronic health records
The application of machine learning (ML) tools in electronic health records (EHRs) can help reduce the underdiagnosis of dementia, but models that are not designed to reflect minority populations may perpetuate that underdiagnosis. To address the underdiagnosis of dementia in both Black Americans (BA...
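Main Authors: | Shao, Yijun; Todd, Kaitlin; Shutes-David, Andrew; Millard, Steven P.; Brown, Karl; Thomas, Amy; Chen, Kathryn; Wilson, Katherine; Zeng, Qing T.; Tsuang, Debby W. |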
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9934793/ https://www.ncbi.nlm.nih.gov/pubmed/36798376 http://dx.doi.org/10.1101/2023.02.08.23285540 |
_version_ | 1784889950495309824 |
---|---|
author | Shao, Yijun; Todd, Kaitlin; Shutes-David, Andrew; Millard, Steven P.; Brown, Karl; Thomas, Amy; Chen, Kathryn; Wilson, Katherine; Zeng, Qing T.; Tsuang, Debby W. |
author_sort | Shao, Yijun |
collection | PubMed |
description | The application of machine learning (ML) tools in electronic health records (EHRs) can help reduce the underdiagnosis of dementia, but models that are not designed to reflect minority populations may perpetuate that underdiagnosis. To address the underdiagnosis of dementia in both Black Americans (BAs) and White Americans (WAs), we sought to develop and validate ML models that assign race-specific risk scores. These scores were used to identify undiagnosed dementia in BA and WA Veterans in EHRs. More specifically, risk scores were generated separately for BAs (n=10K) and WAs (n=10K) in training samples of cases and controls by performing ML, equivalence mapping, topic modeling, and a support vector machine (SVM) in structured and unstructured EHR data. Scores were validated via blinded manual chart reviews (n=1.2K) of controls from a separate sample (n=20K). AUCs and negative and positive predictive values (NPVs and PPVs) were calculated to evaluate the models. There was a strong positive relationship between SVM-generated risk scores and undiagnosed dementia. BAs were more likely than WAs to have undiagnosed dementia per chart review, both overall (15.3% vs 9.5%) and among Veterans with >90th percentile cutoff scores (25.6% vs 15.3%). With chart reviews as the reference standard and varied cutoff scores, the BA model performed slightly better than the WA model (AUC=0.86 with NPV=0.98 and PPV=0.26 at >90th percentile cutoff vs AUC=0.77 with NPV=0.98 and PPV=0.15 at >90th percentile cutoff). The AUCs, NPVs, and PPVs suggest that race-specific ML models can assist in the identification of undiagnosed dementia, particularly in BAs. Future studies should investigate implementing EHR-based risk scores in clinics that serve both BA and WA Veterans. |
format | Online Article Text |
id | pubmed-9934793 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-9934793 2023-02-17 Identifying probable dementia in undiagnosed Black and White Americans using machine learning in Veterans Health Administration electronic health records Shao, Yijun; Todd, Kaitlin; Shutes-David, Andrew; Millard, Steven P.; Brown, Karl; Thomas, Amy; Chen, Kathryn; Wilson, Katherine; Zeng, Qing T.; Tsuang, Debby W. medRxiv Article The application of machine learning (ML) tools in electronic health records (EHRs) can help reduce the underdiagnosis of dementia, but models that are not designed to reflect minority populations may perpetuate that underdiagnosis. To address the underdiagnosis of dementia in both Black Americans (BAs) and White Americans (WAs), we sought to develop and validate ML models that assign race-specific risk scores. These scores were used to identify undiagnosed dementia in BA and WA Veterans in EHRs. More specifically, risk scores were generated separately for BAs (n=10K) and WAs (n=10K) in training samples of cases and controls by performing ML, equivalence mapping, topic modeling, and a support vector machine (SVM) in structured and unstructured EHR data. Scores were validated via blinded manual chart reviews (n=1.2K) of controls from a separate sample (n=20K). AUCs and negative and positive predictive values (NPVs and PPVs) were calculated to evaluate the models. There was a strong positive relationship between SVM-generated risk scores and undiagnosed dementia. BAs were more likely than WAs to have undiagnosed dementia per chart review, both overall (15.3% vs 9.5%) and among Veterans with >90th percentile cutoff scores (25.6% vs 15.3%). With chart reviews as the reference standard and varied cutoff scores, the BA model performed slightly better than the WA model (AUC=0.86 with NPV=0.98 and PPV=0.26 at >90th percentile cutoff vs AUC=0.77 with NPV=0.98 and PPV=0.15 at >90th percentile cutoff). The AUCs, NPVs, and PPVs suggest that race-specific ML models can assist in the identification of undiagnosed dementia, particularly in BAs. Future studies should investigate implementing EHR-based risk scores in clinics that serve both BA and WA Veterans. Cold Spring Harbor Laboratory 2023-02-14 /pmc/articles/PMC9934793/ /pubmed/36798376 http://dx.doi.org/10.1101/2023.02.08.23285540 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
title | Identifying probable dementia in undiagnosed Black and White Americans using machine learning in Veterans Health Administration electronic health records |
title_sort | identifying probable dementia in undiagnosed black and white americans using machine learning in veterans health administration electronic health records |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9934793/ https://www.ncbi.nlm.nih.gov/pubmed/36798376 http://dx.doi.org/10.1101/2023.02.08.23285540 |
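
The description reports AUC, PPV, and NPV for risk scores thresholded at the >90th percentile, with chart review as the reference standard. The sketch below illustrates how such metrics can be computed from a list of scores and binary reference labels. It is not the authors' code: the function names, the nearest-rank percentile rule, and the toy data are all assumptions made for illustration, and the toy numbers are unrelated to the study's results.

```python
import math
from typing import List, Sequence, Tuple


def percentile_cutoff(scores: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile (an assumed convention, not taken from the record)."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def ppv_npv(scores: Sequence[float], labels: Sequence[int], cutoff: float) -> Tuple[float, float]:
    """PPV and NPV when scores strictly above the cutoff are flagged as positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        if score > cutoff:
            if label:
                tp += 1
            else:
                fp += 1
        else:
            if label:
                fn += 1
            else:
                tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Assumes at least one positive and one negative label.
    """
    pos: List[float] = [s for s, y in zip(scores, labels) if y]
    neg: List[float] = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: ten risk scores and chart-review labels (1 = dementia found).
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
toy_labels = [0, 0, 0, 0, 0, 0, 1, 0, 0, 1]

toy_cutoff = percentile_cutoff(toy_scores, 90)
toy_ppv, toy_npv = ppv_npv(toy_scores, toy_labels, toy_cutoff)
toy_auc = auc(toy_scores, toy_labels)
print(f"cutoff={toy_cutoff} PPV={toy_ppv:.2f} NPV={toy_npv:.2f} AUC={toy_auc:.3f}")
```

Because PPV and NPV depend on how common the condition is among those reviewed, values computed this way are only comparable between groups when the differing prevalence (15.3% vs 9.5% in the record above) is kept in mind.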