TCMNER and PubMed: A Novel Chinese Character-Level-Based Model and a Dataset for TCM Named Entity Recognition

Bibliographic Details
Main authors: Liu, Zhi; Luo, Changyong; Zheng, Zeyu; Li, Yan; Fu, Dianzheng; Yu, Xinzhu; Zhao, Jiawei
Format: Online Article Text
Language: English
Published: Hindawi 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8369169/
https://www.ncbi.nlm.nih.gov/pubmed/34413968
http://dx.doi.org/10.1155/2021/3544281
author Liu, Zhi
Luo, Changyong
Zheng, Zeyu
Li, Yan
Fu, Dianzheng
Yu, Xinzhu
Zhao, Jiawei
collection PubMed
description Intelligent traditional Chinese medicine (TCM) has become a popular research field thanks to the rapid development of deep learning technology. Important achievements have been made in representative tasks such as the automatic diagnosis of TCM syndromes and diseases and the generation of TCM herbal prescriptions. However, one unavoidable issue that still hinders progress is the lack of labeled samples, i.e., TCM medical records. Named entity recognition (NER) models trained on various TCM resources are an efficient tool that can alleviate this problem and continuously increase the number of labeled TCM samples. In this work, on the basis of in-depth analysis, we argue that the performance of a TCM named entity recognition model can be improved by using character-level representation and tagging, and we propose a novel word-character integrated self-attention module. With the help of TCM doctors and experts, we define 5 classes of TCM named entities and construct a comprehensive NER dataset containing the standard content of publications and clinical medical records. The experimental results on this dataset demonstrate the effectiveness of the proposed module.
format Online
Article
Text
id pubmed-8369169
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8369169 2021-08-18 J Healthc Eng Research Article Hindawi 2021-08-07 /pmc/articles/PMC8369169/ /pubmed/34413968 http://dx.doi.org/10.1155/2021/3544281 Text en Copyright © 2021 Zhi Liu et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title TCMNER and PubMed: A Novel Chinese Character-Level-Based Model and a Dataset for TCM Named Entity Recognition
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8369169/
https://www.ncbi.nlm.nih.gov/pubmed/34413968
http://dx.doi.org/10.1155/2021/3544281