1493. Creation and External Validation of a Clinical Prediction Rule for Diarrheal Etiology Using Natural Language Processing
Main authors: | , , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Oxford University Press
2019
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6810629/ http://dx.doi.org/10.1093/ofid/ofz360.1357 |
Summary: | BACKGROUND: Infectious diarrheal illness is a significant contributor to healthcare costs in the US pediatric population. New multi-pathogen PCR-based panels have shown increased sensitivity over previous methods; however, they are costly and their clinical utility may be limited in many cases. Clinical Prediction Rules (CPRs) may help optimize the appropriate use of these tests. Furthermore, Natural Language Processing (NLP) is an emerging tool to extract clinical history for decision support. Here, we examine NLP for the validation of a CPR for pediatric diarrhea. METHODS: Using data from a prospective clinical trial at 5 US pediatric hospitals, 961 diarrheal cases were assessed for etiology and relevant clinical variables. Of 65 variables collected in that study, 42 were excluded from our models because they were sparsely documented in reviewed clinical charts. The remaining 23 variables were ranked by random forest (RF) variable importance and used in both an RF and a stepwise logistic regression (LR) model for viral-only etiology. We investigated whether NLP could accurately extract from clinical notes data comparable to those collected by study questionnaires. We used the eHOST abstraction software to abstract 6 clinical variables from patient charts that were useful in our CPR. These data will be used to train an NLP algorithm to extract the same variables from additional charts, and will be combined with data from 2 other variables coded in the EMR to externally validate our model. RESULTS: Both the RF and LR models achieved a cross-validated area under the receiver operating characteristic curve of 0.74 using the top 5 variables (season, age, bloody diarrhea, vomiting/nausea, and fever), which did not improve significantly with the addition of more variables. Of 270 charts abstracted for NLP training, there were 41 occurrences of bloody diarrhea annotated, 339 occurrences of vomiting, and 145 occurrences of fever. Inter-annotator agreement over 9 training sets ranged between 0.63 and 0.83. CONCLUSION: We have constructed a parsimonious CPR involving only 5 inputs for the prediction of a viral-only etiology for pediatric diarrheal illness using prospectively collected data. Once an NLP algorithm has been trained for automated chart abstraction, we will use it to validate the CPR. NLP could allow a CPR to run without manual data entry to improve care. DISCLOSURES: All authors: No reported disclosures. |
---|
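
The summary reports two kinds of statistic: an area under the receiver operating characteristic curve (0.74) and an inter-annotator agreement score (0.63 to 0.83). The sketch below illustrates how such figures can be computed in plain Python. It is not the authors' code. The abstract does not name the agreement statistic, so Cohen's kappa is used here as an assumption, and the function names `roc_auc` and `cohen_kappa` and all toy data are hypothetical.

```python
from collections import Counter


def roc_auc(labels, scores):
    """AUC as the chance that a random positive case outscores a random negative.

    Ties count as half a win (Mann-Whitney U formulation).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def cohen_kappa(annotator_a, annotator_b):
    """Chance-corrected agreement between two annotators over the same items.

    Assumption: the abstract does not say which agreement measure was used.
    """
    if len(annotator_a) != len(annotator_b) or not annotator_a:
        raise ValueError("annotations must be non-empty and of equal length")
    n = len(annotator_a)
    observed = sum(a == b for a, b in zip(annotator_a, annotator_b)) / n
    counts_a = Counter(annotator_a)
    counts_b = Counter(annotator_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: 1 = viral-only etiology, 0 = other etiology.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)

# Hypothetical chart annotations from two reviewers for the "fever" variable.
reviewer_1 = ["fever", "fever", "none", "none", "fever", "none"]
reviewer_2 = ["fever", "none", "none", "none", "fever", "none"]
toy_kappa = cohen_kappa(reviewer_1, reviewer_2)
```

On the toy data, 8 of the 9 positive/negative pairs are ranked correctly, and the two reviewers agree on 5 of 6 charts against an expected chance agreement of 0.5. A cross-validated estimate, as in the summary, would repeat the AUC calculation on held-out folds and average the results.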