
Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine

BACKGROUND: The modernization of traditional Chinese medicine (TCM) demands systematic data mining using medical records. However, this process is hindered by the fact that many TCM symptoms have the same meaning but different literal expressions (i.e., TCM synonymous symptoms). This problem can be...


Bibliographic Details
Main Authors: Zhou, Lu; Liu, Shuangqiao; Li, Caiyan; Sun, Yuemeng; Zhang, Yizhuo; Li, Yuda; Yuan, Huimin; Sun, Yan; Xu, Fengqin; Li, Yuhang
Format: Online Article Text
Language: English
Published: Hindawi 2021
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8523248/
https://www.ncbi.nlm.nih.gov/pubmed/34671408
http://dx.doi.org/10.1155/2021/6676607
author Zhou, Lu
Liu, Shuangqiao
Li, Caiyan
Sun, Yuemeng
Zhang, Yizhuo
Li, Yuda
Yuan, Huimin
Sun, Yan
Xu, Fengqin
Li, Yuhang
collection PubMed
description BACKGROUND: The modernization of traditional Chinese medicine (TCM) demands systematic data mining using medical records. However, this process is hindered by the fact that many TCM symptoms have the same meaning but different literal expressions (i.e., TCM synonymous symptoms). This problem can be solved by using natural language processing algorithms to construct a high-quality TCM symptom normalization model for normalizing TCM synonymous symptoms to unified literal expressions. METHODS: Four types of TCM symptom normalization models, based on natural language processing, were constructed to find a high-quality one: (1) a text sequence generation model based on a bidirectional long short-term memory (Bi-LSTM) neural network with an encoder-decoder structure; (2) a text classification model based on a Bi-LSTM neural network and sigmoid function; (3) a text sequence generation model based on bidirectional encoder representation from transformers (BERT) with sequence-to-sequence training method of unified language model (BERT-UniLM); (4) a text classification model based on BERT and sigmoid function (BERT-Classification). The performance of the models was compared using four metrics: accuracy, recall, precision, and F1-score. RESULTS: The BERT-Classification model outperformed the models based on Bi-LSTM and BERT-UniLM with respect to the four metrics. CONCLUSIONS: The BERT-Classification model has superior performance in normalizing expressions of TCM synonymous symptoms.
format Online
Article
Text
id pubmed-8523248
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-8523248 2021-10-19 Evid Based Complement Alternat Med Research Article Hindawi 2021-10-11 /pmc/articles/PMC8523248/ /pubmed/34671408 http://dx.doi.org/10.1155/2021/6676607 Text en Copyright © 2021 Lu Zhou et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8523248/
https://www.ncbi.nlm.nih.gov/pubmed/34671408
http://dx.doi.org/10.1155/2021/6676607