Traditional Chinese Medicine Text Similarity Calculation Model Based on the Bidirectional Temporal Siamese Network

The text similarity calculation plays a crucial role as the core work of artificial intelligence commercial applications such as traditional Chinese medicine (TCM) auxiliary diagnosis, intelligent question and answer, and prescription recommendation. However, TCM texts have problems such as short se...

Bibliographic Details
Main Authors: Luo, Jigen; Xiong, Wangping; Du, Jianqiang; Liu, Yingfeng; Li, Jianwen; Hu, Dingxing
Format: Online Article Text
Language: English
Published: Hindawi 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8648447/
https://www.ncbi.nlm.nih.gov/pubmed/34880918
http://dx.doi.org/10.1155/2021/2337924
author Luo, Jigen
Xiong, Wangping
Du, Jianqiang
Liu, Yingfeng
Li, Jianwen
Hu, Dingxing
collection PubMed
description The text similarity calculation plays a crucial role as the core work of artificial intelligence commercial applications such as traditional Chinese medicine (TCM) auxiliary diagnosis, intelligent question and answer, and prescription recommendation. However, TCM texts have problems such as short sentence expression, inaccurate word segmentation, strong semantic relevance, high feature dimension, and sparseness. This study comprehensively considers the temporal information of sentence context and proposes a TCM text similarity calculation model based on the bidirectional temporal Siamese network (BTSN). We used the enhanced representation through knowledge integration (ERNIE) pretrained language model to train character vectors instead of word vectors and solved the problem of inaccurate word segmentation in TCM. In the Siamese network, the traditional fully connected neural network was replaced by a deep bidirectional long short-term memory (BLSTM) to capture the contextual semantics of the current word information. The improved similarity BLSTM was used to map the sentence that is to be tested into two sets of low-dimensional numerical vectors. Then, we performed similarity calculation training. Experiments on the two datasets of financial and TCM show that the performance of the BTSN model in this study was better than that of other similarity calculation models. When the number of layers of the BLSTM reached 6 layers, the accuracy of the model was the highest. This verifies that the text similarity calculation model proposed in this study has high engineering value.
format Online
Article
Text
id pubmed-8648447
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8648447 2021-12-07 Traditional Chinese Medicine Text Similarity Calculation Model Based on the Bidirectional Temporal Siamese Network Luo, Jigen Xiong, Wangping Du, Jianqiang Liu, Yingfeng Li, Jianwen Hu, Dingxing Evid Based Complement Alternat Med Research Article
Hindawi 2021-11-29 /pmc/articles/PMC8648447/ /pubmed/34880918 http://dx.doi.org/10.1155/2021/2337924 Text en Copyright © 2021 Jigen Luo et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Traditional Chinese Medicine Text Similarity Calculation Model Based on the Bidirectional Temporal Siamese Network
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8648447/
https://www.ncbi.nlm.nih.gov/pubmed/34880918
http://dx.doi.org/10.1155/2021/2337924