Development and validation of MedDRA Tagger: a tool for extraction and structuring medical information from clinical notes
Rapid and automated extraction of clinical information from patients’ notes is a desirable though difficult task. Natural language processing (NLP) and machine learning have great potential to automate and accelerate such applications, but developing such models can require a large amount of labeled...
Main authors: | Humbert-Droz, Marie; Corley, Jessica; Tamang, Suzanne; Gevaert, Olivier |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cold Spring Harbor Laboratory 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9774225/ https://www.ncbi.nlm.nih.gov/pubmed/36561189 http://dx.doi.org/10.1101/2022.12.14.22283470 |
_version_ | 1784855355750088704 |
---|---|
author | Humbert-Droz, Marie; Corley, Jessica; Tamang, Suzanne; Gevaert, Olivier |
author_facet | Humbert-Droz, Marie; Corley, Jessica; Tamang, Suzanne; Gevaert, Olivier |
author_sort | Humbert-Droz, Marie |
collection | PubMed |
description | Rapid, automated extraction of clinical information from patients’ notes is desirable but difficult. Natural language processing (NLP) and machine learning have great potential to automate and accelerate such applications, but developing such models can require a large amount of labeled clinical text, and producing those labels is slow and laborious. To address this gap, we propose the MedDRA tagger, a fast annotation tool that uses industrial-strength libraries such as spaCy, biomedical ontologies, and weak supervision to annotate and extract clinical concepts at scale. The tool can be used to annotate clinical text and obtain labels for training machine learning models that further refine clinical concept extraction, or to extract clinical concepts for observational studies. To demonstrate the usability and versatility of our tool, we present three use cases: identifying patients with a primary brain cancer diagnosis, showing evidence of rising mental health symptoms at the population level, and tracing the evolution of COVID-19 symptomatology across three waves between February 2020 and October 2021. Validation showed good performance on both specific annotations from our development set (F1 score 0.81) and an open-source annotated data set (F1 score 0.79). Finally, we note that the modular nature of our tool allows straightforward adaptation to another biomedical ontology. We also show that our tool is independent of the EHR system and, as such, generalizable. |
format | Online Article Text |
id | pubmed-9774225 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Cold Spring Harbor Laboratory |
record_format | MEDLINE/PubMed |
spelling | pubmed-9774225 2022-12-23 Development and validation of MedDRA Tagger: a tool for extraction and structuring medical information from clinical notes Humbert-Droz, Marie; Corley, Jessica; Tamang, Suzanne; Gevaert, Olivier medRxiv Article Cold Spring Harbor Laboratory 2022-12-14 /pmc/articles/PMC9774225/ /pubmed/36561189 http://dx.doi.org/10.1101/2022.12.14.22283470 Text en https://creativecommons.org/licenses/by-nc-nd/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/), which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
title | Development and validation of MedDRA Tagger: a tool for extraction and structuring medical information from clinical notes |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9774225/ https://www.ncbi.nlm.nih.gov/pubmed/36561189 http://dx.doi.org/10.1101/2022.12.14.22283470 |
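
The description in this record mentions ontology-based annotation of clinical text and reports F1 scores. The sketch below is only an illustration of those two general ideas, not the authors' implementation: it uses plain Python rather than spaCy, omits weak supervision and negation handling, and every name in it (`LEXICON`, `tag`, `f1_score`, the `CONCEPT_*` labels) is hypothetical. The labels are placeholders, not real MedDRA codes.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical term dictionary: surface form -> concept label.
# The labels are illustrative placeholders, not real MedDRA codes.
LEXICON: Dict[str, str] = {
    "headache": "CONCEPT_HEADACHE",
    "shortness of breath": "CONCEPT_DYSPNEA",
    "fever": "CONCEPT_FEVER",
}

Span = Tuple[int, int, str]


def tag(text: str, lexicon: Dict[str, str] = LEXICON) -> List[Span]:
    """Return (start, end, concept) spans for each lexicon term found in text."""
    spans: List[Span] = []
    for term, concept in lexicon.items():
        # Whole-word, case-insensitive match of the literal term.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.append((match.start(), match.end(), concept))
    return sorted(spans)


def f1_score(predicted: Set[Span], gold: Set[Span]) -> float:
    """Harmonic mean of precision and recall over exact span matches."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

For example, `tag("Patient reports fever and shortness of breath.")` yields one span per matched term, and `f1_score(set(predicted), set(gold))` compares such spans against a manually annotated set. Exact span matching is one possible scoring convention; the record does not state which one the authors used.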