Chinese-Named Entity Recognition From Adverse Drug Event Records: Radical Embedding-Combined Dynamic Embedding–Based BERT in a Bidirectional Long Short-term Conditional Random Field (Bi-LSTM-CRF) Model

BACKGROUND: With the increasing variety of drugs, the incidence of adverse drug events (ADEs) is increasing year by year. Massive numbers of ADEs are recorded in electronic medical records and adverse drug reaction (ADR) reports, which are important sources of potential ADR information. Meanwhile, i...

Bibliographic Details
Main authors: Wu, Hong, Ji, Jiatong, Tian, Haimei, Chen, Yao, Ge, Weihong, Zhang, Haixia, Yu, Feng, Zou, Jianjun, Nakamura, Mitsuhiro, Liao, Jun
Format: Online Article Text
Language: English
Published: JMIR Publications 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8686410/
https://www.ncbi.nlm.nih.gov/pubmed/34855616
http://dx.doi.org/10.2196/26407
_version_ 1784618013453975552
author Wu, Hong
Ji, Jiatong
Tian, Haimei
Chen, Yao
Ge, Weihong
Zhang, Haixia
Yu, Feng
Zou, Jianjun
Nakamura, Mitsuhiro
Liao, Jun
author_sort Wu, Hong
collection PubMed
description BACKGROUND: As the variety of drugs grows, the incidence of adverse drug events (ADEs) is rising year by year. Large numbers of ADEs are recorded in electronic medical records and adverse drug reaction (ADR) reports, which are important sources of potential ADR information. Making this latent ADR information automatically available is essential for better postmarketing drug safety reevaluation and pharmacovigilance.
OBJECTIVE: This study describes how to identify ADR-related information from Chinese ADE reports.
METHODS: Our study established an efficient automated tool named BBC-Radical, a model with 3 components: Bidirectional Encoder Representations from Transformers (BERT), bidirectional long short-term memory (Bi-LSTM), and conditional random field (CRF). The model identifies ADR-related information from Chinese ADR reports. Token features and radical features of Chinese characters were used to represent the common meaning of a group of words. Novel BERT and Bi-LSTM-CRF models combined these features to perform named entity recognition (NER) on the free-text section of 24,890 ADR reports from the Jiangsu Province Adverse Drug Reaction Monitoring Center from 2010 to 2016. In addition, a man-machine comparison experiment on ADE records from Drum Tower Hospital was designed to compare the NER performance of the BBC-Radical model with that of a manual method.
RESULTS: The NER model achieved relatively high performance, with a precision of 96.4%, recall of 96.0%, and F1 score of 96.2%. In the man-machine comparison, the BBC-Radical model (precision 87.2%, recall 85.7%, and F1 score 86.4%) outperformed the manual method (precision 86.1%, recall 73.8%, and F1 score 79.5%) in recognizing each kind of entity.
CONCLUSIONS: The proposed model was competitive in extracting ADR-related information from ADE reports, and the results suggest that applying our method to extract ADR-related information is of great significance in improving the quality of ADR reports and postmarketing drug safety evaluation.
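The F1 scores reported in the abstract can be checked against the stated precision and recall, since F1 is the harmonic mean of the two. The following is a minimal sketch of that arithmetic only; the function name `f1_score` and the row labels are illustrative assumptions and do not come from the paper or its code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) as stated in the abstract.
# The labels are hypothetical names chosen for this illustration.
reported = {
    "NER model": (96.4, 96.0, 96.2),
    "BBC-Radical (man-machine comparison)": (87.2, 85.7, 86.4),
    "Manual method (man-machine comparison)": (86.1, 73.8, 79.5),
}

for label, (precision, recall, stated_f1) in reported.items():
    computed = round(f1_score(precision, recall), 1)
    print(f"{label}: computed F1 = {computed}, reported F1 = {stated_f1}")
```

Rounded to one decimal place, each computed value agrees with the figure given in the abstract.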
format Online
Article
Text
id pubmed-8686410
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-8686410 2022-01-10 JMIR Med Inform Original Paper JMIR Publications 2021-12-01 /pmc/articles/PMC8686410/ /pubmed/34855616 http://dx.doi.org/10.2196/26407 Text en ©Hong Wu, Jiatong Ji, Haimei Tian, Yao Chen, Weihong Ge, Haixia Zhang, Feng Yu, Jianjun Zou, Mitsuhiro Nakamura, Jun Liao. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 01.12.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
title Chinese-Named Entity Recognition From Adverse Drug Event Records: Radical Embedding-Combined Dynamic Embedding–Based BERT in a Bidirectional Long Short-term Conditional Random Field (Bi-LSTM-CRF) Model
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8686410/
https://www.ncbi.nlm.nih.gov/pubmed/34855616
http://dx.doi.org/10.2196/26407