A Multigranularity Text Driven Named Entity Recognition CGAN Model for Traditional Chinese Medicine Literatures
Recognition of Traditional Chinese Medicine (TCM) entities from different types of literature is a challenging research problem and the foundation for extracting the large amount of TCM knowledge in unstructured texts into structured formats. The lack of large-scale annotated data makes unsatisfa...
Main Authors: | Ma, Yuekun; Liu, Yun; Zhang, Dezheng; Zhang, Jiye; Liu, He; Xie, Yonghong |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9553443/ https://www.ncbi.nlm.nih.gov/pubmed/36248956 http://dx.doi.org/10.1155/2022/1495841 |
_version_ | 1784806472034549760 |
---|---|
author | Ma, Yuekun; Liu, Yun; Zhang, Dezheng; Zhang, Jiye; Liu, He; Xie, Yonghong |
author_sort | Ma, Yuekun |
collection | PubMed |
description | Recognition of Traditional Chinese Medicine (TCM) entities from different types of literature is a challenging research problem and the foundation for extracting the large amount of TCM knowledge in unstructured texts into structured formats. The lack of large-scale annotated data makes the application of conventional deep learning models to TCM text knowledge extraction unsatisfactory. Some unsupervised methods rely on auxiliary data, such as domain dictionaries. We propose a multigranularity text-driven NER model based on a Conditional Generative Adversarial Network (MT-CGAN) to implement TCM NER with a small-scale annotated corpus. In the model, a multigranularity text features encoder (MTFE) is designed to extract rich semantic and grammatical information from multiple dimensions of TCM texts. By differentiating the conditional constraints of the generator and discriminator of MT-CGAN, the synchronization between the generated tag labels and the named entities is guaranteed. Furthermore, seeds of different TCM text types are introduced into our model to improve the precision of NER. We compare our method with baseline methods on four kinds of gold-standard datasets to illustrate its effectiveness. The experimental results show that the standard precision, recall, and F1 score of our method are higher than those of the state-of-the-art methods by 0.24–8.97%, 0.89–12.74%, and 0.01–10.84%, respectively. MT-CGAN is able to extract entities from different types of TCM literature effectively. Our experimental results indicate that the proposed approach has a clear advantage in processing TCM texts with more entity types, higher sparsity, less regular features, and a small-scale corpus. |
format | Online Article Text |
id | pubmed-9553443 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-9553443 2022-10-13 A Multigranularity Text Driven Named Entity Recognition CGAN Model for Traditional Chinese Medicine Literatures Ma, Yuekun; Liu, Yun; Zhang, Dezheng; Zhang, Jiye; Liu, He; Xie, Yonghong Comput Intell Neurosci Research Article Recognition of Traditional Chinese Medicine (TCM) entities from different types of literature is a challenging research problem and the foundation for extracting the large amount of TCM knowledge in unstructured texts into structured formats. The lack of large-scale annotated data makes the application of conventional deep learning models to TCM text knowledge extraction unsatisfactory. Some unsupervised methods rely on auxiliary data, such as domain dictionaries. We propose a multigranularity text-driven NER model based on a Conditional Generative Adversarial Network (MT-CGAN) to implement TCM NER with a small-scale annotated corpus. In the model, a multigranularity text features encoder (MTFE) is designed to extract rich semantic and grammatical information from multiple dimensions of TCM texts. By differentiating the conditional constraints of the generator and discriminator of MT-CGAN, the synchronization between the generated tag labels and the named entities is guaranteed. Furthermore, seeds of different TCM text types are introduced into our model to improve the precision of NER. We compare our method with baseline methods on four kinds of gold-standard datasets to illustrate its effectiveness. The experimental results show that the standard precision, recall, and F1 score of our method are higher than those of the state-of-the-art methods by 0.24–8.97%, 0.89–12.74%, and 0.01–10.84%, respectively. MT-CGAN is able to extract entities from different types of TCM literature effectively. Our experimental results indicate that the proposed approach has a clear advantage in processing TCM texts with more entity types, higher sparsity, less regular features, and a small-scale corpus. Hindawi 2022-09-24 /pmc/articles/PMC9553443/ /pubmed/36248956 http://dx.doi.org/10.1155/2022/1495841 Text en Copyright © 2022 Yuekun Ma et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | A Multigranularity Text Driven Named Entity Recognition CGAN Model for Traditional Chinese Medicine Literatures |
title_sort | multigranularity text driven named entity recognition cgan model for traditional chinese medicine literatures |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9553443/ https://www.ncbi.nlm.nih.gov/pubmed/36248956 http://dx.doi.org/10.1155/2022/1495841 |