Automated identification of unstandardized medication data: a scalable and flexible data standardization pipeline using RxNorm on GEMINI multicenter hospital data
OBJECTIVE: Patient data repositories often assemble medication data from multiple sources, necessitating standardization prior to analysis. We implemented and evaluated a medication standardization procedure for use with a wide range of pharmacy data inputs across all drug categories, which supports...
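Main Authors: | Waters, Riley; Malecki, Sarah; Lail, Sharan; Mak, Denise; Saha, Sudipta; Jung, Hae Young; Imrit, Mohammed Arshad; Razak, Fahad; Verma, Amol A |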
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Research and Applications |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10409892/ https://www.ncbi.nlm.nih.gov/pubmed/37565023 http://dx.doi.org/10.1093/jamiaopen/ooad062 |
_version_ | 1785086341927665664 |
---|---|
author | Waters, Riley Malecki, Sarah Lail, Sharan Mak, Denise Saha, Sudipta Jung, Hae Young Imrit, Mohammed Arshad Razak, Fahad Verma, Amol A |
author_facet | Waters, Riley Malecki, Sarah Lail, Sharan Mak, Denise Saha, Sudipta Jung, Hae Young Imrit, Mohammed Arshad Razak, Fahad Verma, Amol A |
author_sort | Waters, Riley |
collection | PubMed |
description | OBJECTIVE: Patient data repositories often assemble medication data from multiple sources, necessitating standardization prior to analysis. We implemented and evaluated a medication standardization procedure for use with a wide range of pharmacy data inputs across all drug categories, which supports research queries at multiple levels of granularity. METHODS: The GEMINI-RxNorm system automates the use of multiple RxNorm tools in tandem with other datasets to identify drug concepts from pharmacy orders. GEMINI-RxNorm was used to process 2 090 155 pharmacy orders from 245 258 hospitalizations between 2010 and 2017 at 7 hospitals in Ontario, Canada. The GEMINI-RxNorm system matches drug-identifying information from pharmacy data (including free-text fields) to RxNorm concept identifiers. A user interface allows researchers to search for drug terms and returns the relevant original pharmacy data through the matched RxNorm concepts. Users can then manually validate the predicted matches and discard false positives. We designed the system to maximize recall (sensitivity) and enable excellent precision (positive predictive value) with efficient manual validation. We compared the performance of this system to manual coding (by a physician and pharmacist) of 13 medication classes. RESULTS: Manual coding was performed for 1 948 817 pharmacy orders and GEMINI-RxNorm successfully returned 1 941 389 (99.6%) orders. Recall was greater than 0.985 in all 13 drug classes, and the F1-score and precision remained above 0.90 in all drug classes, facilitating efficient manual review to achieve 100% precision. GEMINI-RxNorm saved time substantially compared with manual standardization, reducing the time taken to review a pharmacy order row from an estimated 30 to 5 s and reducing the number of rows needed to be reviewed by up to 99.99%. DISCUSSION AND CONCLUSION: GEMINI-RxNorm presents a novel combination of RxNorm tools and other datasets to enable accurate, efficient, flexible, and scalable standardization of pharmacy data. By facilitating efficient manual validation, the GEMINI-RxNorm system can allow researchers to achieve near-perfect accuracy in medication data standardization. |
format | Online Article Text |
id | pubmed-10409892 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10409892 2023-08-10 Automated identification of unstandardized medication data: a scalable and flexible data standardization pipeline using RxNorm on GEMINI multicenter hospital data Waters, Riley Malecki, Sarah Lail, Sharan Mak, Denise Saha, Sudipta Jung, Hae Young Imrit, Mohammed Arshad Razak, Fahad Verma, Amol A JAMIA Open Research and Applications OBJECTIVE: Patient data repositories often assemble medication data from multiple sources, necessitating standardization prior to analysis. We implemented and evaluated a medication standardization procedure for use with a wide range of pharmacy data inputs across all drug categories, which supports research queries at multiple levels of granularity. METHODS: The GEMINI-RxNorm system automates the use of multiple RxNorm tools in tandem with other datasets to identify drug concepts from pharmacy orders. GEMINI-RxNorm was used to process 2 090 155 pharmacy orders from 245 258 hospitalizations between 2010 and 2017 at 7 hospitals in Ontario, Canada. The GEMINI-RxNorm system matches drug-identifying information from pharmacy data (including free-text fields) to RxNorm concept identifiers. A user interface allows researchers to search for drug terms and returns the relevant original pharmacy data through the matched RxNorm concepts. Users can then manually validate the predicted matches and discard false positives. We designed the system to maximize recall (sensitivity) and enable excellent precision (positive predictive value) with efficient manual validation. We compared the performance of this system to manual coding (by a physician and pharmacist) of 13 medication classes. RESULTS: Manual coding was performed for 1 948 817 pharmacy orders and GEMINI-RxNorm successfully returned 1 941 389 (99.6%) orders. Recall was greater than 0.985 in all 13 drug classes, and the F1-score and precision remained above 0.90 in all drug classes, facilitating efficient manual review to achieve 100% precision. GEMINI-RxNorm saved time substantially compared with manual standardization, reducing the time taken to review a pharmacy order row from an estimated 30 to 5 s and reducing the number of rows needed to be reviewed by up to 99.99%. DISCUSSION AND CONCLUSION: GEMINI-RxNorm presents a novel combination of RxNorm tools and other datasets to enable accurate, efficient, flexible, and scalable standardization of pharmacy data. By facilitating efficient manual validation, the GEMINI-RxNorm system can allow researchers to achieve near-perfect accuracy in medication data standardization. Oxford University Press 2023-08-08 /pmc/articles/PMC10409892/ /pubmed/37565023 http://dx.doi.org/10.1093/jamiaopen/ooad062 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Research and Applications Waters, Riley Malecki, Sarah Lail, Sharan Mak, Denise Saha, Sudipta Jung, Hae Young Imrit, Mohammed Arshad Razak, Fahad Verma, Amol A Automated identification of unstandardized medication data: a scalable and flexible data standardization pipeline using RxNorm on GEMINI multicenter hospital data |
title | Automated identification of unstandardized medication data: a scalable and flexible data standardization pipeline using RxNorm on GEMINI multicenter hospital data |
title_full | Automated identification of unstandardized medication data: a scalable and flexible data standardization pipeline using RxNorm on GEMINI multicenter hospital data |
title_fullStr | Automated identification of unstandardized medication data: a scalable and flexible data standardization pipeline using RxNorm on GEMINI multicenter hospital data |
title_full_unstemmed | Automated identification of unstandardized medication data: a scalable and flexible data standardization pipeline using RxNorm on GEMINI multicenter hospital data |
title_short | Automated identification of unstandardized medication data: a scalable and flexible data standardization pipeline using RxNorm on GEMINI multicenter hospital data |
title_sort | automated identification of unstandardized medication data: a scalable and flexible data standardization pipeline using rxnorm on gemini multicenter hospital data |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10409892/ https://www.ncbi.nlm.nih.gov/pubmed/37565023 http://dx.doi.org/10.1093/jamiaopen/ooad062 |
work_keys_str_mv | AT watersriley automatedidentificationofunstandardizedmedicationdataascalableandflexibledatastandardizationpipelineusingrxnormongeminimulticenterhospitaldata AT maleckisarah automatedidentificationofunstandardizedmedicationdataascalableandflexibledatastandardizationpipelineusingrxnormongeminimulticenterhospitaldata AT lailsharan automatedidentificationofunstandardizedmedicationdataascalableandflexibledatastandardizationpipelineusingrxnormongeminimulticenterhospitaldata AT makdenise automatedidentificationofunstandardizedmedicationdataascalableandflexibledatastandardizationpipelineusingrxnormongeminimulticenterhospitaldata AT sahasudipta automatedidentificationofunstandardizedmedicationdataascalableandflexibledatastandardizationpipelineusingrxnormongeminimulticenterhospitaldata AT junghaeyoung automatedidentificationofunstandardizedmedicationdataascalableandflexibledatastandardizationpipelineusingrxnormongeminimulticenterhospitaldata AT imritmohammedarshad automatedidentificationofunstandardizedmedicationdataascalableandflexibledatastandardizationpipelineusingrxnormongeminimulticenterhospitaldata AT razakfahad automatedidentificationofunstandardizedmedicationdataascalableandflexibledatastandardizationpipelineusingrxnormongeminimulticenterhospitaldata AT vermaamola automatedidentificationofunstandardizedmedicationdataascalableandflexibledatastandardizationpipelineusingrxnormongeminimulticenterhospitaldata |
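
The evaluation figures quoted in the description (share of orders returned, recall, precision, F1-score, per-row review time) can be reproduced with simple arithmetic. The following is a minimal sketch in Python, not the authors' code: the function name `precision_recall_f1` and the true/false positive and false negative counts are hypothetical and chosen only for illustration, while the order counts and the 30 s and 5 s review times are taken from the abstract.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from raw match counts.

    precision = TP / (TP + FP)   (positive predictive value)
    recall    = TP / (TP + FN)   (sensitivity)
    F1        = harmonic mean of precision and recall
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Figures reported in the abstract.
manually_coded_orders = 1_948_817
returned_orders = 1_941_389
share_returned = returned_orders / manually_coded_orders  # about 0.996

# Estimated per-row review time, manual vs. assisted (seconds, from the abstract).
seconds_manual = 30
seconds_assisted = 5
speedup_per_row = seconds_manual / seconds_assisted  # 6.0

# Hypothetical counts for one drug class (NOT from the paper): 1000 true
# orders, of which 985 were retrieved, plus 90 false positives to discard.
precision, recall, f1 = precision_recall_f1(tp=985, fp=90, fn=15)

print(f"share of manually coded orders returned: {share_returned:.1%}")
print(f"per-row review speedup: {speedup_per_row:.0f}x")
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

With counts like these, recall sits at the 0.985 level reported for every drug class while precision stays above 0.90, so a reviewer only has to discard a small number of false positives to reach full precision on the retrieved rows.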