Multi-label classification of symptom terms from free-text bilingual adverse drug reaction reports using natural language processing

Allergic reactions to medication range from mild to severe or even life-threatening. Proper documentation of patient allergy information is critical for safe prescription, avoiding drug interactions, and reducing healthcare costs. Allergy information is regularly obtained during the medical intervie...

Bibliographic Details
Main authors: Chaichulee, Sitthichok; Promchai, Chissanupong; Kaewkomon, Tanyamai; Kongkamol, Chanon; Ingviya, Thammasin; Sangsupawanich, Pasuree
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9352066/
https://www.ncbi.nlm.nih.gov/pubmed/35925971
http://dx.doi.org/10.1371/journal.pone.0270595
author Chaichulee, Sitthichok
Promchai, Chissanupong
Kaewkomon, Tanyamai
Kongkamol, Chanon
Ingviya, Thammasin
Sangsupawanich, Pasuree
collection PubMed
description Allergic reactions to medication range from mild to severe or even life-threatening. Proper documentation of patient allergy information is critical for safe prescription, avoiding drug interactions, and reducing healthcare costs. Allergy information is regularly obtained during the medical interview, but is often poorly documented in electronic health records (EHRs). While many EHRs allow for structured adverse drug reaction (ADR) reporting, free-text entry is still common. The resulting information is neither interoperable nor easily reusable for other applications, such as clinical decision support systems and prescription alerts. Current approaches require pharmacists to review and code ADRs documented by healthcare professionals. Recently, the effectiveness of machine learning algorithms in natural language processing (NLP) has been widely demonstrated. Our study aims to develop and evaluate different NLP algorithms that can encode unstructured ADRs stored in EHRs into institutional symptom terms. Our dataset consists of 79,712 pharmacist-reviewed drug allergy records. We evaluated three NLP techniques: Naive Bayes-Support Vector Machine (NB-SVM), Universal Language Model Fine-tuning (ULMFiT), and Bidirectional Encoder Representations from Transformers (BERT). We tested different general-domain pre-trained BERT models, including mBERT, XLM-RoBERTa, and WanchanBERTa, as well as our domain-specific AllergyRoBERTa, which was pre-trained from scratch on our corpus. Overall, BERT models had the highest performance. NB-SVM outperformed ULMFiT and BERT for several symptom terms that are not frequently coded. The ensemble model achieved an exact match ratio of 95.33%, an F1 score of 98.88%, and a mean average precision of 97.07% for the 36 most frequently coded symptom terms. The model was then further developed into a symptom term suggestion system and achieved a Krippendorff’s alpha agreement coefficient of 0.7081 in prospective testing with pharmacists. Some degree of automation could both accelerate the availability of allergy information and reduce the effort required for human coding.
format Online
Article
Text
id pubmed-9352066
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9352066 2022-08-05 PLoS One Research Article Public Library of Science 2022-08-04 /pmc/articles/PMC9352066/ /pubmed/35925971 http://dx.doi.org/10.1371/journal.pone.0270595 Text en © 2022 Chaichulee et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Multi-label classification of symptom terms from free-text bilingual adverse drug reaction reports using natural language processing
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9352066/
https://www.ncbi.nlm.nih.gov/pubmed/35925971
http://dx.doi.org/10.1371/journal.pone.0270595