Automatic symptom name normalization in clinical records of traditional Chinese medicine

BACKGROUND: In recent years, data mining technology has been applied more than ever before in the field of traditional Chinese medicine (TCM) to discover regularities in the experience accumulated over thousands of years in China. Electronic medical records (or clinical records) of TCM, con...

Bibliographic Details
Main Authors: Wang, Yaqiang; Yu, Zhonghua; Jiang, Yongguang; Xu, Kaikuo; Chen, Xia
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3098075/
https://www.ncbi.nlm.nih.gov/pubmed/20089162
http://dx.doi.org/10.1186/1471-2105-11-40
author Wang, Yaqiang
Yu, Zhonghua
Jiang, Yongguang
Xu, Kaikuo
Chen, Xia
collection PubMed
description BACKGROUND: In recent years, data mining technology has been applied more than ever before in the field of traditional Chinese medicine (TCM) to discover regularities in the experience accumulated over thousands of years in China. Electronic medical records (or clinical records) of TCM contain more information than the well-structured prescription data extracted manually from TCM literature, such as information related to the medical treatment process, and could therefore be an important source for discovering valuable regularities of TCM. However, they are collected by TCM doctors on a day-to-day basis without the support of an authoritative editorial board, and owing to the different experience and backgrounds of TCM doctors, the same concept might be described by several different terms. Therefore, clinical records of TCM cannot be used directly for data mining and knowledge discovery. This paper focuses on the phenomenon of "one symptom with different names" and investigates a series of metrics for automatically normalizing symptom names in clinical records of TCM. RESULTS: A series of extensive experiments was performed to validate the proposed metrics. They show that the hybrid similarity metrics, which integrate literal similarity and remedy-based similarity, are more accurate than metrics based on literal similarity or remedy-based similarity alone, and that the highest F-measure of all the metrics (65.62%) is achieved by the hybrid similarity metric VSM+TFIDF+SWD. CONCLUSIONS: Automatic symptom name normalization is an essential task for discovering knowledge from clinical data of TCM. This paper introduces the problem for the first time. The results verify that the investigated metrics are reasonable and accurate, and that the hybrid similarity metrics are much better than the metrics based on literal similarity or remedy-based similarity alone.
format Text
id pubmed-3098075
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3098075 2011-05-20 Automatic symptom name normalization in clinical records of traditional Chinese medicine Wang, Yaqiang Yu, Zhonghua Jiang, Yongguang Xu, Kaikuo Chen, Xia BMC Bioinformatics Research Article BioMed Central 2010-01-20 /pmc/articles/PMC3098075/ /pubmed/20089162 http://dx.doi.org/10.1186/1471-2105-11-40 Text en Copyright ©2010 Wang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automatic symptom name normalization in clinical records of traditional Chinese medicine
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3098075/
https://www.ncbi.nlm.nih.gov/pubmed/20089162
http://dx.doi.org/10.1186/1471-2105-11-40
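
The abstract in this record describes hybrid metrics that combine a literal (string-based) similarity with a remedy-based similarity, evaluated by F-measure. The record does not define VSM, TFIDF, or SWD, so the sketch below is only an illustration of the general idea and not the authors' method: it substitutes a plain character-level cosine similarity for the literal part and a Jaccard overlap of prescribed herbs for the remedy part. The function names, the weight `alpha`, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter


def literal_similarity(name_a: str, name_b: str) -> float:
    """Cosine similarity of character-count vectors (a stand-in literal metric)."""
    u, v = Counter(name_a), Counter(name_b)
    dot = sum(u[ch] * v[ch] for ch in u.keys() & v.keys())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


def remedy_similarity(herbs_a: set, herbs_b: set) -> float:
    """Jaccard overlap of the herbs prescribed with each symptom name (a stand-in)."""
    union = herbs_a | herbs_b
    return len(herbs_a & herbs_b) / len(union) if union else 0.0


def hybrid_similarity(name_a, name_b, herbs_a, herbs_b, alpha=0.5):
    """Weighted mix of the two parts; alpha is a hypothetical tuning weight."""
    return alpha * literal_similarity(name_a, name_b) + (1 - alpha) * remedy_similarity(
        herbs_a, herbs_b
    )


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


if __name__ == "__main__":
    # Toy data invented for illustration: two common words for "headache"
    # and placeholder herb identifiers.
    score = hybrid_similarity("头痛", "头疼", {"A", "B", "C"}, {"B", "C", "D"})
    print(f"hybrid similarity: {score:.2f}")
    print(f"F-measure at P=0.6, R=0.6: {f_measure(0.6, 0.6):.2f}")
```

In a normalization pipeline of this kind, two symptom names would be treated as variants of one symptom when their hybrid score exceeds a chosen threshold; both the threshold and `alpha` would need to be tuned on labeled data.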