
Feature-enhanced text-inception model for Chinese long text classification

To solve the problem regarding unbalanced distribution of multi-category Chinese long texts and improve the classification accuracy thereof, a data enhancement method was proposed. Combined with this method, a feature-enhanced text-inception model for Chinese long text classification was proposed. F...


Bibliographic Details
Main Authors: Yang, Guo, Jiayu, Yan, Dongdong, Xu, Zelin, Guo, Hai, Huan
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9902449/
https://www.ncbi.nlm.nih.gov/pubmed/36747060
http://dx.doi.org/10.1038/s41598-023-29013-0
_version_ 1784883264318603264
author Yang, Guo
Jiayu, Yan
Dongdong, Xu
Zelin, Guo
Hai, Huan
author_sort Yang, Guo
collection PubMed
description To solve the problem regarding unbalanced distribution of multi-category Chinese long texts and improve the classification accuracy thereof, a data enhancement method was proposed. Combined with this method, a feature-enhanced text-inception model for Chinese long text classification was proposed. First, the model used a novel text-inception module to extract important shallow features of the text. Meanwhile, the bidirectional gated recurrent unit (Bi-GRU) and the capsule neural network were employed to form a deep feature extraction module to understand the semantic information in the text; K-MaxPooling was then used to reduce the dimension of its shallow and deep features and enhance the overall features. Finally, the Softmax function was used for classification. By comparing the classification effects with a variety of models, the results show that the model can significantly improve the accuracy of long Chinese text classification and has a strong ability to recognize long Chinese text features. The accuracy of the model is 93.97% when applied to an experimental dataset.
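The K-MaxPooling step named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state the value of k or the axis pooled over, so the function names `k_max_pooling` and `k_max_pooling_channels`, the choice to pool along the sequence axis, and the sample values are all assumptions made for illustration. The sketch keeps the k largest activations of each feature channel while preserving their original order in the sequence.

```python
from typing import List, Sequence


def k_max_pooling(sequence: Sequence[float], k: int) -> List[float]:
    """Return the k largest values of one channel, in their original order.

    Hypothetical helper: the paper's exact pooling configuration is not
    given in this record.
    """
    if not 1 <= k <= len(sequence):
        raise ValueError("k must be between 1 and the sequence length")
    # Positions of the k largest activations (ties resolved by earliest position).
    top = sorted(range(len(sequence)), key=lambda i: sequence[i], reverse=True)[:k]
    # Re-sort the positions so the kept values follow the sequence order.
    return [sequence[i] for i in sorted(top)]


def k_max_pooling_channels(features: Sequence[Sequence[float]], k: int) -> List[List[float]]:
    """Apply k-max pooling to each column of a (seq_len x channels) matrix.

    Returns a (k x channels) matrix as a list of rows.
    """
    columns = [list(col) for col in zip(*features)]   # one list per channel
    pooled = [k_max_pooling(col, k) for col in columns]
    return [list(row) for row in zip(*pooled)]        # back to row-major layout


if __name__ == "__main__":
    # Four time steps, two feature channels (made-up numbers).
    feats = [[1.0, 9.0], [5.0, 2.0], [3.0, 7.0], [4.0, 8.0]]
    print(k_max_pooling_channels(feats, k=2))
```

Unlike ordinary max pooling, which keeps a single value per channel, this variant reduces a sequence of any length to a fixed k values per channel without discarding their relative order.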
format Online
Article
Text
id pubmed-9902449
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-99024492023-02-08 Feature-enhanced text-inception model for Chinese long text classification Yang, Guo Jiayu, Yan Dongdong, Xu Zelin, Guo Hai, Huan Sci Rep Article To solve the problem regarding unbalanced distribution of multi-category Chinese long texts and improve the classification accuracy thereof, a data enhancement method was proposed. Combined with this method, a feature-enhanced text-inception model for Chinese long text classification was proposed. First, the model used a novel text-inception module to extract important shallow features of the text. Meanwhile, the bidirectional gated recurrent unit (Bi-GRU) and the capsule neural network were employed to form a deep feature extraction module to understand the semantic information in the text; K-MaxPooling was then used to reduce the dimension of its shallow and deep features and enhance the overall features. Finally, the Softmax function was used for classification. By comparing the classification effects with a variety of models, the results show that the model can significantly improve the accuracy of long Chinese text classification and has a strong ability to recognize long Chinese text features. The accuracy of the model is 93.97% when applied to an experimental dataset. Nature Publishing Group UK 2023-02-06 /pmc/articles/PMC9902449/ /pubmed/36747060 http://dx.doi.org/10.1038/s41598-023-29013-0 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. 
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Feature-enhanced text-inception model for Chinese long text classification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9902449/
https://www.ncbi.nlm.nih.gov/pubmed/36747060
http://dx.doi.org/10.1038/s41598-023-29013-0