Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine
BACKGROUND: The modernization of traditional Chinese medicine (TCM) demands systematic data mining using medical records. However, this process is hindered by the fact that many TCM symptoms have the same meaning but different literal expressions (i.e., TCM synonymous symptoms). This problem can be...
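Main authors: | Zhou, Lu; Liu, Shuangqiao; Li, Caiyan; Sun, Yuemeng; Zhang, Yizhuo; Li, Yuda; Yuan, Huimin; Sun, Yan; Xu, Fengqin; Li, Yuhang |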
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8523248/ https://www.ncbi.nlm.nih.gov/pubmed/34671408 http://dx.doi.org/10.1155/2021/6676607 |
_version_ | 1784585260243091456 |
---|---|
author | Zhou, Lu Liu, Shuangqiao Li, Caiyan Sun, Yuemeng Zhang, Yizhuo Li, Yuda Yuan, Huimin Sun, Yan Xu, Fengqin Li, Yuhang |
author_facet | Zhou, Lu Liu, Shuangqiao Li, Caiyan Sun, Yuemeng Zhang, Yizhuo Li, Yuda Yuan, Huimin Sun, Yan Xu, Fengqin Li, Yuhang |
author_sort | Zhou, Lu |
collection | PubMed |
description | BACKGROUND: The modernization of traditional Chinese medicine (TCM) demands systematic data mining using medical records. However, this process is hindered by the fact that many TCM symptoms have the same meaning but different literal expressions (i.e., TCM synonymous symptoms). This problem can be solved by using natural language processing algorithms to construct a high-quality TCM symptom normalization model for normalizing TCM synonymous symptoms to unified literal expressions. METHODS: Four types of TCM symptom normalization models, based on natural language processing, were constructed to find a high-quality one: (1) a text sequence generation model based on a bidirectional long short-term memory (Bi-LSTM) neural network with an encoder-decoder structure; (2) a text classification model based on a Bi-LSTM neural network and sigmoid function; (3) a text sequence generation model based on bidirectional encoder representation from transformers (BERT) with sequence-to-sequence training method of unified language model (BERT-UniLM); (4) a text classification model based on BERT and sigmoid function (BERT-Classification). The performance of the models was compared using four metrics: accuracy, recall, precision, and F1-score. RESULTS: The BERT-Classification model outperformed the models based on Bi-LSTM and BERT-UniLM with respect to the four metrics. CONCLUSIONS: The BERT-Classification model has superior performance in normalizing expressions of TCM synonymous symptoms. |
format | Online Article Text |
id | pubmed-8523248 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-8523248 2021-10-19 Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine Zhou, Lu Liu, Shuangqiao Li, Caiyan Sun, Yuemeng Zhang, Yizhuo Li, Yuda Yuan, Huimin Sun, Yan Xu, Fengqin Li, Yuhang Evid Based Complement Alternat Med Research Article Hindawi 2021-10-11 /pmc/articles/PMC8523248/ /pubmed/34671408 http://dx.doi.org/10.1155/2021/6676607 Text en Copyright © 2021 Lu Zhou et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8523248/ https://www.ncbi.nlm.nih.gov/pubmed/34671408 http://dx.doi.org/10.1155/2021/6676607 |
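
The abstract in this record describes classification models that apply a sigmoid function to their outputs and are compared on accuracy, recall, precision, and F1-score. The following is a minimal sketch of that evaluation step only, not the authors' implementation. It assumes each symptom expression is assigned the single normalized class with the highest sigmoid score and that the metrics are macro-averaged over classes; the record does not state either choice. The logits, labels, and function names are made up for illustration.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function mapping a raw score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_label(logits: list[float]) -> int:
    """Return the index of the class with the highest sigmoid score (assumed decision rule)."""
    scores = [sigmoid(z) for z in logits]
    return max(range(len(scores)), key=scores.__getitem__)

def evaluate(y_true: list[int], y_pred: list[int], n_classes: int) -> dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1 (averaging scheme is an assumption)."""
    precisions, recalls, f1s = [], [], []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        # A metric with a zero denominator is counted as 0.0 here.
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }

# Hypothetical logits for four symptom expressions over three normalized classes.
toy_logits = [
    [2.0, -1.0, -3.0],
    [-2.0, 1.5, 0.0],
    [-1.0, 0.5, 0.2],
    [-3.0, 2.0, -1.0],
]
toy_true = [0, 1, 2, 1]
toy_pred = [predict_label(row) for row in toy_logits]
metrics = evaluate(toy_true, toy_pred, n_classes=3)
print(toy_pred, metrics)
```

In this toy run the third expression is misclassified, so accuracy is 0.75 while macro F1 drops to 0.6, which illustrates why the four metrics can rank models differently when some classes are rare.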