
Development and validation of MedDRA Tagger: a tool for extraction and structuring medical information from clinical notes

Rapid and automated extraction of clinical information from patients’ notes is a desirable though difficult task. Natural language processing (NLP) and machine learning have great potential to automate and accelerate such applications, but developing such models can require a large amount of labeled...


Bibliographic Details
Main authors: Humbert-Droz, Marie, Corley, Jessica, Tamang, Suzanne, Gevaert, Olivier
Format: Online Article Text
Language: English
Published: Cold Spring Harbor Laboratory 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9774225/
https://www.ncbi.nlm.nih.gov/pubmed/36561189
http://dx.doi.org/10.1101/2022.12.14.22283470
author Humbert-Droz, Marie
Corley, Jessica
Tamang, Suzanne
Gevaert, Olivier
collection PubMed
description Rapid and automated extraction of clinical information from patients’ notes is a desirable though difficult task. Natural language processing (NLP) and machine learning have great potential to automate and accelerate such applications, but developing such models can require a large amount of labeled clinical text, which can be a slow and laborious process. To address this gap, we propose the MedDRA tagger, a fast annotation tool that makes use of industrial level libraries such as spaCy, biomedical ontologies and weak supervision to annotate and extract clinical concepts at scale. The tool can be used to annotate clinical text and obtain labels for training machine learning models and further refine the clinical concept extraction performance, or to extract clinical concepts for observational study purposes. To demonstrate the usability and versatility of our tool, we present three different use cases: we use the tagger to determine patients with a primary brain cancer diagnosis, we show evidence of rising mental health symptoms at the population level and our last use case shows the evolution of COVID-19 symptomatology throughout three waves between February 2020 and October 2021. The validation of our tool showed good performance on both specific annotations from our development set (F1 score 0.81) and open source annotated data set (F1 score 0.79). We successfully demonstrate the versatility of our pipeline with three different use cases. Finally, we note that the modular nature of our tool allows for a straightforward adaptation to another biomedical ontology. We also show that our tool is independent of EHR system, and as such generalizable.
format Online
Article
Text
id pubmed-9774225
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Cold Spring Harbor Laboratory
record_format MEDLINE/PubMed
spelling pubmed-9774225 2022-12-23 Development and validation of MedDRA Tagger: a tool for extraction and structuring medical information from clinical notes Humbert-Droz, Marie Corley, Jessica Tamang, Suzanne Gevaert, Olivier medRxiv Article
Cold Spring Harbor Laboratory 2022-12-14 /pmc/articles/PMC9774225/ /pubmed/36561189 http://dx.doi.org/10.1101/2022.12.14.22283470 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
title Development and validation of MedDRA Tagger: a tool for extraction and structuring medical information from clinical notes
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9774225/
https://www.ncbi.nlm.nih.gov/pubmed/36561189
http://dx.doi.org/10.1101/2022.12.14.22283470