4501 Statistical Modeling for Predicting Correct Drug Dose in the Presence of Conflicting Dose Information Extracted from Electronic Health Records
OBJECTIVES/GOALS: Diverse medication-based studies require longitudinal drug dose information. EHRs can provide such data, but multiple mentions of a drug in the same clinical note can yield conflicting dose. We aimed to develop statistical methods which address this challenge by predicting the vali...
Main authors: | Williams, Michael Lee; Weeks, Hannah L; Beck, Cole; McNeer, Elizabeth; Choi, Leena |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Cambridge University Press
2020
|
Subjects: | Data Science/Biostatistics/Informatics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8823233/ http://dx.doi.org/10.1017/cts.2020.185 |
_version_ | 1784646759779139584 |
---|---|
author | Williams, Michael Lee Weeks, Hannah L Beck, Cole McNeer, Elizabeth Choi, Leena |
author_facet | Williams, Michael Lee Weeks, Hannah L Beck, Cole McNeer, Elizabeth Choi, Leena |
author_sort | Williams, Michael Lee |
collection | PubMed |
description | OBJECTIVES/GOALS: Diverse medication-based studies require longitudinal drug dose information. EHRs can provide such data, but multiple mentions of a drug in the same clinical note can yield conflicting doses. We aimed to develop statistical methods that address this challenge by predicting the valid dose when conflicting doses are extracted. METHODS/STUDY POPULATION: We extracted dose information for two test drugs, tacrolimus and lamotrigine, from Vanderbilt EHRs using medExtractR, a natural language processing system developed by our team. A random forest classifier was used to estimate the probability of correctness for each extracted dose on the basis of subject longitudinal dosing patterns and extracted EHR note context. Using this feasibility measure and other features, such as a summary of subject dosing history, we developed several statistical models to predict the dose on the basis of the extracted doses. The models based on supervised methods included a separate random forest regression, a transition model, and a boosting model. We also considered unsupervised methods and developed a Bayesian hierarchical model. RESULTS/ANTICIPATED RESULTS: We compared model-predicted doses to physician-validated doses to evaluate model performance. A random forest regression model outperformed all other proposed models. Because this model is supervised, its utility would depend on the availability of validated doses. Our preliminary result from a Bayesian hierarchical model showed that it can be a promising alternative, although it performed less well. The Bayesian hierarchical model would be especially useful when validated dose data are not available, as it was developed in an unsupervised modeling framework and hence does not require validated doses, which can be difficult and time-consuming to obtain. We evaluated the feasibility of each method for automatic implementation in the drug dose extraction and processing system we have been developing. DISCUSSION/SIGNIFICANCE OF IMPACT: We will incorporate the developed methods as part of our complete medication extraction system, which will allow automatic preparation of large longitudinal medication dose datasets for researchers. Availability of such data will enable diverse medication-based studies with drastically reduced barriers to data collection. |
format | Online Article Text |
id | pubmed-8823233 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Cambridge University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8823233 2022-02-18 4501 Statistical Modeling for Predicting Correct Drug Dose in the Presence of Conflicting Dose Information Extracted from Electronic Health Records Williams, Michael Lee Weeks, Hannah L Beck, Cole McNeer, Elizabeth Choi, Leena J Clin Transl Sci Data Science/Biostatistics/Informatics Cambridge University Press 2020-07-29 /pmc/articles/PMC8823233/ http://dx.doi.org/10.1017/cts.2020.185 Text en © The Association for Clinical and Translational Science 2020 https://creativecommons.org/licenses/by/4.0/ This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Data Science/Biostatistics/Informatics Williams, Michael Lee Weeks, Hannah L Beck, Cole McNeer, Elizabeth Choi, Leena 4501 Statistical Modeling for Predicting Correct Drug Dose in the Presence of Conflicting Dose Information Extracted from Electronic Health Records |
title | 4501 Statistical Modeling for Predicting Correct Drug Dose in the Presence of Conflicting Dose Information Extracted from Electronic Health Records |
topic | Data Science/Biostatistics/Informatics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8823233/ http://dx.doi.org/10.1017/cts.2020.185 |