4501 Statistical Modeling for Predicting Correct Drug Dose in the Presence of Conflicting Dose Information Extracted from Electronic Health Records

OBJECTIVES/GOALS: Diverse medication-based studies require longitudinal drug dose information. EHRs can provide such data, but multiple mentions of a drug in the same clinical note can yield conflicting doses. We aimed to develop statistical methods that address this challenge by predicting the vali...

Bibliographic Details
Main authors: Williams, Michael Lee; Weeks, Hannah L; Beck, Cole; McNeer, Elizabeth; Choi, Leena
Format: Online Article Text
Language: English
Published: Cambridge University Press 2020
Subjects: Data Science/Biostatistics/Informatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8823233/
http://dx.doi.org/10.1017/cts.2020.185
author Williams, Michael Lee
Weeks, Hannah L
Beck, Cole
McNeer, Elizabeth
Choi, Leena
collection PubMed
description OBJECTIVES/GOALS: Diverse medication-based studies require longitudinal drug dose information. EHRs can provide such data, but multiple mentions of a drug in the same clinical note can yield conflicting doses. We aimed to develop statistical methods that address this challenge by predicting the valid dose when conflicting doses are extracted.
METHODS/STUDY POPULATION: We extracted dose information for two test drugs, tacrolimus and lamotrigine, from Vanderbilt EHRs using medExtractR, a natural language processing system developed by our team. A random forest classifier was used to estimate the probability of correctness for each extracted dose on the basis of each subject's longitudinal dosing patterns and context extracted from the EHR note. Using this feasibility measure and other features, such as a summary of the subject's dosing history, we developed several statistical models to predict the dose from the extracted doses. The supervised models included a separate random forest regression, a transition model, and a boosting model. We also considered unsupervised methods and developed a Bayesian hierarchical model.
RESULTS/ANTICIPATED RESULTS: We compared model-predicted doses to physician-validated doses to evaluate model performance. The random forest regression model outperformed all other proposed models. Because this model is supervised, its utility depends on the availability of validated doses. Our preliminary results from the Bayesian hierarchical model showed that it can be a promising alternative, although it did not perform as well. The Bayesian hierarchical model would be especially useful when validated dose data are not available: it was developed in an unsupervised modeling framework and hence does not require validated doses, which can be difficult and time-consuming to obtain. We evaluated the feasibility of each method for automatic implementation in the drug dose extraction and processing system that we have been developing.
DISCUSSION/SIGNIFICANCE OF IMPACT: We will incorporate the developed methods into our complete medication extraction system, which will allow large longitudinal medication dose datasets to be prepared automatically for researchers. Availability of such data will enable diverse medication-based studies with drastically reduced barriers to data collection.
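The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: several conflicting doses are extracted per note, each carries an estimated probability of correctness, one dose is chosen per note, and the choice is compared with a validated dose. The selection rule used here (keep the most probable candidate) is a deliberate simplification and is not one of the models described in the abstract. The names `Candidate`, `pick_dose_per_note`, and `agreement_rate`, along with all data values, are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One dose mention extracted from a clinical note (hypothetical schema)."""
    note_id: str
    dose_mg: float
    p_correct: float  # assumed output of an upstream correctness classifier


def pick_dose_per_note(candidates):
    """Return {note_id: dose_mg}, keeping the most probable candidate per note."""
    by_note = defaultdict(list)
    for c in candidates:
        by_note[c.note_id].append(c)
    return {
        note_id: max(group, key=lambda c: c.p_correct).dose_mg
        for note_id, group in by_note.items()
    }


def agreement_rate(predicted, validated):
    """Fraction of validated notes whose predicted dose matches exactly."""
    if not validated:
        raise ValueError("no validated doses to compare against")
    hits = sum(
        1 for note_id, dose in validated.items()
        if predicted.get(note_id) == dose
    )
    return hits / len(validated)


# Made-up data: two notes, each with two conflicting extracted doses.
candidates = [
    Candidate("note-1", 2.0, 0.15),
    Candidate("note-1", 3.0, 0.80),
    Candidate("note-2", 100.0, 0.55),
    Candidate("note-2", 150.0, 0.40),
]
validated = {"note-1": 3.0, "note-2": 150.0}  # made-up reference doses

predicted = pick_dose_per_note(candidates)
rate = agreement_rate(predicted, validated)
print(predicted, rate)
```

In this toy data the rule recovers the validated dose for note-1 but not for note-2. That is one reason a model drawing on further features, such as the dosing-history summary mentioned in the abstract, might do better than a plain argmax.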
format Online
Article
Text
id pubmed-8823233
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Cambridge University Press
record_format MEDLINE/PubMed
spelling pubmed-8823233 2022-02-18 J Clin Transl Sci Data Science/Biostatistics/Informatics Cambridge University Press 2020-07-29 /pmc/articles/PMC8823233/ http://dx.doi.org/10.1017/cts.2020.185 Text en © The Association for Clinical and Translational Science 2020 https://creativecommons.org/licenses/by/4.0/ This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
title 4501 Statistical Modeling for Predicting Correct Drug Dose in the Presence of Conflicting Dose Information Extracted from Electronic Health Records
topic Data Science/Biostatistics/Informatics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8823233/
http://dx.doi.org/10.1017/cts.2020.185