Improving Methods of Identifying Anaphylaxis for Medical Product Safety Surveillance Using Natural Language Processing and Machine Learning
We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve performance of automated health-care claims-based algorithms to identify anaphylaxis events using data on 516 patients with outpatient, emergency department, or i...
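Main Authors: | Carrell, David S; Gruber, Susan; Floyd, James S; Bann, Maralyssa A; Cushing-Haugen, Kara L; Johnson, Ron L; Graham, Vina; Cronkite, David J; Hazlehurst, Brian L; Felcher, Andrew H; Bejan, Cosmin A; Kennedy, Adee; Shinde, Mayura U; Karami, Sara; Ma, Yong; Stojanovic, Danijela; Zhao, Yueqin; Ball, Robert; Nelson, Jennifer C |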
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9896464/ https://www.ncbi.nlm.nih.gov/pubmed/36331289 http://dx.doi.org/10.1093/aje/kwac182 |
_version_ | 1784882057120317440 |
---|---|
author | Carrell, David S; Gruber, Susan; Floyd, James S; Bann, Maralyssa A; Cushing-Haugen, Kara L; Johnson, Ron L; Graham, Vina; Cronkite, David J; Hazlehurst, Brian L; Felcher, Andrew H; Bejan, Cosmin A; Kennedy, Adee; Shinde, Mayura U; Karami, Sara; Ma, Yong; Stojanovic, Danijela; Zhao, Yueqin; Ball, Robert; Nelson, Jennifer C |
author_facet | Carrell, David S; Gruber, Susan; Floyd, James S; Bann, Maralyssa A; Cushing-Haugen, Kara L; Johnson, Ron L; Graham, Vina; Cronkite, David J; Hazlehurst, Brian L; Felcher, Andrew H; Bejan, Cosmin A; Kennedy, Adee; Shinde, Mayura U; Karami, Sara; Ma, Yong; Stojanovic, Danijela; Zhao, Yueqin; Ball, Robert; Nelson, Jennifer C |
author_sort | Carrell, David S |
collection | PubMed |
description | We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve performance of automated health-care claims-based algorithms to identify anaphylaxis events using data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015–2019 in 2 integrated health-care institutions in the Northwest United States. We used one site’s manually reviewed gold-standard outcomes data for model development and the other’s for external validation based on cross-validated area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and sensitivity. In the development site 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis compared with 180 (65%) of 277 in the validation site. Logistic regression models using only structured claims data achieved a cross-validated AUC of 0.58 (95% CI: 0.54, 0.63). Machine learning improved cross-validated AUC to 0.62 (0.58, 0.66); incorporating NLP-derived covariates further increased cross-validated AUCs to 0.70 (0.66, 0.75) in development and 0.67 (0.63, 0.71) in external validation data. A classification threshold with cross-validated PPV of 79% and cross-validated sensitivity of 66% in development data had cross-validated PPV of 78% and cross-validated sensitivity of 56% in external data. Machine learning and NLP-derived data improved identification of validated anaphylaxis events. |
format | Online Article Text |
id | pubmed-9896464 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9896464 2023-02-06 Improving Methods of Identifying Anaphylaxis for Medical Product Safety Surveillance Using Natural Language Processing and Machine Learning Carrell, David S Gruber, Susan Floyd, James S Bann, Maralyssa A Cushing-Haugen, Kara L Johnson, Ron L Graham, Vina Cronkite, David J Hazlehurst, Brian L Felcher, Andrew H Bejan, Cosmin A Kennedy, Adee Shinde, Mayura U Karami, Sara Ma, Yong Stojanovic, Danijela Zhao, Yueqin Ball, Robert Nelson, Jennifer C Am J Epidemiol Practice of Epidemiology We sought to determine whether machine learning and natural language processing (NLP) applied to electronic medical records could improve performance of automated health-care claims-based algorithms to identify anaphylaxis events using data on 516 patients with outpatient, emergency department, or inpatient anaphylaxis diagnosis codes during 2015–2019 in 2 integrated health-care institutions in the Northwest United States. We used one site’s manually reviewed gold-standard outcomes data for model development and the other’s for external validation based on cross-validated area under the receiver operating characteristic curve (AUC), positive predictive value (PPV), and sensitivity. In the development site 154 (64%) of 239 potential events met adjudication criteria for anaphylaxis compared with 180 (65%) of 277 in the validation site. Logistic regression models using only structured claims data achieved a cross-validated AUC of 0.58 (95% CI: 0.54, 0.63). Machine learning improved cross-validated AUC to 0.62 (0.58, 0.66); incorporating NLP-derived covariates further increased cross-validated AUCs to 0.70 (0.66, 0.75) in development and 0.67 (0.63, 0.71) in external validation data. A classification threshold with cross-validated PPV of 79% and cross-validated sensitivity of 66% in development data had cross-validated PPV of 78% and cross-validated sensitivity of 56% in external data. Machine learning and NLP-derived data improved identification of validated anaphylaxis events. Oxford University Press 2022-11-04 /pmc/articles/PMC9896464/ /pubmed/36331289 http://dx.doi.org/10.1093/aje/kwac182 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Practice of Epidemiology Carrell, David S Gruber, Susan Floyd, James S Bann, Maralyssa A Cushing-Haugen, Kara L Johnson, Ron L Graham, Vina Cronkite, David J Hazlehurst, Brian L Felcher, Andrew H Bejan, Cosmin A Kennedy, Adee Shinde, Mayura U Karami, Sara Ma, Yong Stojanovic, Danijela Zhao, Yueqin Ball, Robert Nelson, Jennifer C Improving Methods of Identifying Anaphylaxis for Medical Product Safety Surveillance Using Natural Language Processing and Machine Learning |
title | Improving Methods of Identifying Anaphylaxis for Medical Product Safety Surveillance Using Natural Language Processing and Machine Learning |
title_full | Improving Methods of Identifying Anaphylaxis for Medical Product Safety Surveillance Using Natural Language Processing and Machine Learning |
title_fullStr | Improving Methods of Identifying Anaphylaxis for Medical Product Safety Surveillance Using Natural Language Processing and Machine Learning |
title_full_unstemmed | Improving Methods of Identifying Anaphylaxis for Medical Product Safety Surveillance Using Natural Language Processing and Machine Learning |
title_short | Improving Methods of Identifying Anaphylaxis for Medical Product Safety Surveillance Using Natural Language Processing and Machine Learning |
title_sort | improving methods of identifying anaphylaxis for medical product safety surveillance using natural language processing and machine learning |
topic | Practice of Epidemiology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9896464/ https://www.ncbi.nlm.nih.gov/pubmed/36331289 http://dx.doi.org/10.1093/aje/kwac182 |
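
The description field above reports adjudication counts together with PPV and sensitivity figures. As a minimal sketch of how those quantities relate, the Python below recomputes the two adjudication proportions from the counts stated in the record and gives the standard definitions of PPV and sensitivity. The function names and the confusion-matrix counts in the last two lines are hypothetical and purely illustrative; they are not taken from the article, and this is not the authors' code.

```python
def proportion(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, rejecting an empty denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: share of flagged events that are true events."""
    return proportion(true_pos, true_pos + false_pos)


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity: share of true events that the classifier flags."""
    return proportion(true_pos, true_pos + false_neg)


# Counts stated in the record's description field.
DEV_CONFIRMED, DEV_POTENTIAL = 154, 239   # development site
VAL_CONFIRMED, VAL_POTENTIAL = 180, 277   # validation site

dev_rate = proportion(DEV_CONFIRMED, DEV_POTENTIAL)
val_rate = proportion(VAL_CONFIRMED, VAL_POTENTIAL)
total_patients = DEV_POTENTIAL + VAL_POTENTIAL

print(f"development site: {dev_rate:.0%} confirmed")
print(f"validation site:  {val_rate:.0%} confirmed")
print(f"total potential events: {total_patients}")

# Hypothetical counts (NOT from the article), only to show the definitions.
example_ppv = ppv(true_pos=8, false_pos=2)
example_sensitivity = sensitivity(true_pos=8, false_neg=4)
```

The recomputed proportions round to the 64% and 65% given in the description, and the two site denominators sum to the 516 patients it reports.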