Imbalanced Text Sentiment Classification Based on Multi-Channel BLTCN-BLSTM Self-Attention

With the continuous expansion of the field of natural language processing, researchers have found that there is a phenomenon of imbalanced data distribution in some practical problems, and the excellent performance of most methods is based on the assumption that the samples in the dataset are data b...

Bibliographic Details
Main authors: Cai, Tiantian; Zhang, Xinsheng
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9967901/
https://www.ncbi.nlm.nih.gov/pubmed/36850855
http://dx.doi.org/10.3390/s23042257
author Cai, Tiantian
Zhang, Xinsheng
collection PubMed
description As natural language processing has expanded, researchers have found that data in some practical problems are imbalanced, whereas the strong performance of most methods rests on the assumption that the samples in the dataset are balanced. Imbalanced data classification has therefore become a problem that needs study. To mine sentiment information from an imbalanced dataset of short text reviews, this paper proposes a sentiment classification method that fuses a multi-channel BLTCN-BLSTM network with self-attention. In the multi-channel BLTCN-BLSTM self-attention model, the samples after word embedding serve as the input to the multiple channels; once features have been extracted, a self-attention mechanism is fused in to strengthen the sentiment signal and further extract text features. In addition, focus loss rebalancing and classifier enhancement are combined to produce the text sentiment predictions. In the experiments, the best F1 value reaches 0.893 on the Chnsenticorp-HPL-10,000 corpus. Comparison and ablation results, covering accuracy, recall, and F1-measure, show that the proposed model can fully integrate the weights of emotional feature words and effectively improves sentiment classification performance on imbalanced short-text review data.
format Online
Article
Text
id pubmed-9967901
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9967901 2023-02-27 Imbalanced Text Sentiment Classification Based on Multi-Channel BLTCN-BLSTM Self-Attention Cai, Tiantian Zhang, Xinsheng Sensors (Basel) Article MDPI 2023-02-17 /pmc/articles/PMC9967901/ /pubmed/36850855 http://dx.doi.org/10.3390/s23042257 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Imbalanced Text Sentiment Classification Based on Multi-Channel BLTCN-BLSTM Self-Attention
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9967901/
https://www.ncbi.nlm.nih.gov/pubmed/36850855
http://dx.doi.org/10.3390/s23042257
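
The abstract mentions "focus loss rebalancing" for the imbalanced classes. Assuming this refers to the focal loss commonly used for class imbalance (an assumption; the record itself does not define the term), a minimal sketch of the binary form might look like the following. The function name, the defaults gamma=2.0 and alpha=0.25, and the clamping constant are illustrative choices, not values taken from the article.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for a single example (hypothetical helper, not the authors' code).

    p     : predicted probability of the positive class, in [0, 1]
    y     : true label, 0 or 1
    gamma : focusing parameter; larger values down-weight easy examples more
    alpha : weight given to the positive class (1 - alpha goes to the negative class)
    """
    # Probability the model assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Class-balancing weight for the true class.
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over a batch of (probability, label) pairs."""
    losses = [binary_focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which is one way to check such a sketch against a known baseline.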