
An Improved Deep Learning Model: S-TextBLCNN for Traditional Chinese Medicine Formula Classification


Bibliographic Details
Main Authors:	Cheng, Ning, Chen, Yue, Gao, Wanqing, Liu, Jiajun, Huang, Qunfu, Yan, Cheng, Huang, Xindi, Ding, Changsong
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2021
Subjects:	Genetics
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8727750/ https://www.ncbi.nlm.nih.gov/pubmed/35003231 http://dx.doi.org/10.3389/fgene.2021.807825

_version_	1784626588068872192
author	Cheng, Ning Chen, Yue Gao, Wanqing Liu, Jiajun Huang, Qunfu Yan, Cheng Huang, Xindi Ding, Changsong
author_sort	Cheng, Ning
collection	PubMed
description	Purpose: This study proposes the S-TextBLCNN model for classifying the efficacy of traditional Chinese medicine (TCM) formulae. The model uses deep learning to analyze the relationship between herb efficacy and formula efficacy, which helps in further exploring the internal rules of formula combination. Methods: First, for the TCM herbs extracted from the Chinese Pharmacopoeia, natural language processing (NLP) is used to learn a quantitative representation of each herb. Three features (herb name, herb properties, and herb efficacy) are selected to encode herbs and to construct the formula-vector and herb-vector. Then, based on 2,664 formulae for stroke collected from the TCM literature and 19 formula efficacy categories extracted from Yifang Jijie, an improved deep learning model, TextBLCNN, is proposed; it consists of a bidirectional long short-term memory (Bi-LSTM) neural network and a convolutional neural network (CNN). Binary classifiers are established for the 19 formula efficacy categories to classify the TCM formulae. Finally, to address the imbalance in the formula data, the over-sampling method SMOTE is applied, yielding the S-TextBLCNN model. Results: The formula-vector composed of herb efficacy gives the best classification performance, which suggests a strong relationship between herb efficacy and formula efficacy. The TextBLCNN model achieves an accuracy of 0.858 and an F1-score of 0.762, both higher than those of the logistic regression (acc = 0.561, F1-score = 0.567), SVM (acc = 0.703, F1-score = 0.591), LSTM (acc = 0.723, F1-score = 0.621), and TextCNN (acc = 0.745, F1-score = 0.644) models. In addition, applying SMOTE to tackle data imbalance improves the F1-score by an average of 47.1% across the 19 models. Conclusion: The combination of formula feature representation and the S-TextBLCNN model improves the accuracy of formula efficacy classification and provides a new research idea for the study of TCM formula compatibility.
format	Online Article Text
id	pubmed-8727750
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-8727750 2022-01-06 Front Genet Genetics Frontiers Media S.A. 2021-12-22 /pmc/articles/PMC8727750/ /pubmed/35003231 http://dx.doi.org/10.3389/fgene.2021.807825 Text en Copyright © 2021 Cheng, Chen, Gao, Liu, Huang, Yan, Huang and Ding. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	An Improved Deep Learning Model: S-TextBLCNN for Traditional Chinese Medicine Formula Classification
topic	Genetics
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8727750/ https://www.ncbi.nlm.nih.gov/pubmed/35003231 http://dx.doi.org/10.3389/fgene.2021.807825
