
Enhancing the Generalization for Text Classification through Fusion of Backward Features


Bibliographic Details
Main Authors: Seng, Dewen; Wu, Xin
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9920874/
https://www.ncbi.nlm.nih.gov/pubmed/36772327
http://dx.doi.org/10.3390/s23031287
author Seng, Dewen
Wu, Xin
collection PubMed
description Generalization has always been a central concern in deep learning. Pretrained models and domain adaptation techniques have received widespread attention as ways to address it; both focus on finding features in the data that improve generalization and prevent overfitting. Although they have achieved good results on various tasks, these models are unstable when classifying a sentence whose overall label is positive but which still contains negative phrases. In this article, we analyzed the attention heat maps of the benchmark models and found that previous models attend more to individual phrases than to the semantic information of the whole sentence. We therefore proposed a method that scatters attention away from opposite-sentiment words to avoid one-sided judgments. We designed a two-stream network, stacking a gradient reversal layer and a feature projection layer within the auxiliary network. The gradient reversal layer reverses the gradient of the features during training, so that in the backpropagation stage the parameters are optimized along the reversed gradient. We used the auxiliary network to extract backward features and fed them into the main network, where they were merged with the normal features extracted by the main network. We applied this method to three baselines (TextCNN, BERT, and RoBERTa) on sentiment analysis and sarcasm detection datasets. The results show that our method improves accuracy by 0.5% on the sentiment analysis datasets and by 2.1% on the sarcasm detection datasets.
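The gradient reversal layer described in the abstract acts as the identity in the forward pass and multiplies the incoming gradient by a negative factor in the backward pass, so upstream parameters are pushed against the loss gradient. A minimal sketch with manual backpropagation on a toy scalar model (the function names, the factor `lam`, and the toy loss are illustrative assumptions, not taken from the authors' code):

```python
# Minimal gradient reversal layer (GRL) sketch with manual backprop.

def grl_forward(x):
    # Forward pass: the GRL is the identity function.
    return x

def grl_backward(grad_output, lam=1.0):
    # Backward pass: the incoming gradient is scaled by -lam,
    # reversing the optimization direction for upstream parameters.
    return -lam * grad_output

# Toy model: h = w * x passed through a GRL, squared-error loss vs. y.
w, x, y, lam = 0.5, 2.0, 2.0, 1.0
h = grl_forward(w * x)                      # identity in the forward pass
loss = (h - y) ** 2
dloss_dh = 2.0 * (h - y)                    # ordinary gradient at the GRL output
dloss_dw = grl_backward(dloss_dh, lam) * x  # reversed gradient w.r.t. w
```

Here the ordinary chain rule would give `dloss/dw = -4.0`, while the GRL flips it to `+4.0`, which is how the auxiliary network can be driven to extract "backward" features.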
format Online
Article
Text
id pubmed-9920874
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9920874 2023-02-12 Sensors (Basel). MDPI 2023-01-23 /pmc/articles/PMC9920874/ /pubmed/36772327 http://dx.doi.org/10.3390/s23031287 © 2023 by the authors. Licensee MDPI, Basel, Switzerland (https://creativecommons.org/licenses/by/4.0/).
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9920874/
https://www.ncbi.nlm.nih.gov/pubmed/36772327
http://dx.doi.org/10.3390/s23031287