Chinese-Named Entity Recognition From Adverse Drug Event Records: Radical Embedding-Combined Dynamic Embedding–Based BERT in a Bidirectional Long Short-term Conditional Random Field (Bi-LSTM-CRF) Model
BACKGROUND: With the increasing variety of drugs, the incidence of adverse drug events (ADEs) is increasing year by year. Massive numbers of ADEs are recorded in electronic medical records and adverse drug reaction (ADR) reports, which are important sources of potential ADR information. Meanwhile, i...
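Main authors: | Wu, Hong; Ji, Jiatong; Tian, Haimei; Chen, Yao; Ge, Weihong; Zhang, Haixia; Yu, Feng; Zou, Jianjun; Nakamura, Mitsuhiro; Liao, Jun |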
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8686410/ https://www.ncbi.nlm.nih.gov/pubmed/34855616 http://dx.doi.org/10.2196/26407 |
_version_ | 1784618013453975552 |
---|---|
author | Wu, Hong Ji, Jiatong Tian, Haimei Chen, Yao Ge, Weihong Zhang, Haixia Yu, Feng Zou, Jianjun Nakamura, Mitsuhiro Liao, Jun |
collection | PubMed |
description | BACKGROUND: As the variety of drugs grows, the incidence of adverse drug events (ADEs) is increasing year by year. Large numbers of ADEs are recorded in electronic medical records and adverse drug reaction (ADR) reports, which are important sources of potential ADR information. Making this latent ADR information automatically available is essential for better postmarketing drug safety reevaluation and pharmacovigilance. OBJECTIVE: This study describes how to identify ADR-related information from Chinese ADE reports. METHODS: Our study established an efficient automated tool, named BBC-Radical, which consists of 3 components: Bidirectional Encoder Representations from Transformers (BERT), bidirectional long short-term memory (Bi-LSTM), and conditional random field (CRF). The model identifies ADR-related information from Chinese ADR reports. Token features and radical features of Chinese characters were used to represent the common meaning of a group of words. BERT and Bi-LSTM-CRF were combined with these features to perform named entity recognition (NER) on the free-text section of 24,890 ADR reports from the Jiangsu Province Adverse Drug Reaction Monitoring Center from 2010 to 2016. In addition, a man-machine comparison experiment on ADE records from Drum Tower Hospital was designed to compare the NER performance of the BBC-Radical model with that of a manual method. RESULTS: The NER model achieved relatively high performance, with a precision of 96.4%, recall of 96.0%, and F1 score of 96.2%. In the man-machine comparison, the BBC-Radical model (precision 87.2%, recall 85.7%, and F1 score 86.4%) outperformed the manual method (precision 86.1%, recall 73.8%, and F1 score 79.5%) in the recognition task for each kind of entity. CONCLUSIONS: The proposed model was competitive in extracting ADR-related information from ADE reports, and the results suggest that applying our method to extract ADR-related information can improve the quality of ADR reports and postmarketing drug safety evaluation. |
format | Online Article Text |
id | pubmed-8686410 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-8686410 2022-01-10 JMIR Med Inform Original Paper JMIR Publications 2021-12-01 /pmc/articles/PMC8686410/ /pubmed/34855616 http://dx.doi.org/10.2196/26407 Text en ©Hong Wu, Jiatong Ji, Haimei Tian, Yao Chen, Weihong Ge, Haixia Zhang, Feng Yu, Jianjun Zou, Mitsuhiro Nakamura, Jun Liao. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 01.12.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included. |
title | Chinese-Named Entity Recognition From Adverse Drug Event Records: Radical Embedding-Combined Dynamic Embedding–Based BERT in a Bidirectional Long Short-term Conditional Random Field (Bi-LSTM-CRF) Model |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8686410/ https://www.ncbi.nlm.nih.gov/pubmed/34855616 http://dx.doi.org/10.2196/26407 |
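
The F1 scores quoted in the description can be cross-checked against the reported precision and recall, since F1 is the harmonic mean of the two. The sketch below is only an illustrative consistency check, not code from the paper; the function name `f1_score` and the row labels are assumptions introduced here, and the figures are copied from the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) in percent, as quoted in the abstract.
# The row labels are hypothetical names chosen for this example.
reported = {
    "NER model": (96.4, 96.0, 96.2),
    "BBC-Radical, man-machine comparison": (87.2, 85.7, 86.4),
    "Manual method, man-machine comparison": (86.1, 73.8, 79.5),
}

for name, (p, r, f1) in reported.items():
    computed = f1_score(p, r)
    print(f"{name}: computed F1 = {computed:.1f}, reported F1 = {f1}")
```

Rounded to one decimal place, each recomputed value agrees with the F1 score given in the record. The gap between the model and the manual method comes mostly from recall (85.7% versus 73.8%), while precision differs by about one point.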