Enhancing the Generalization for Text Classification through Fusion of Backward Features
Generalization has always been a keyword in deep learning. Pretrained models and domain adaptation technology have received widespread attention in solving the problem of generalization. They are all focused on finding features in data to improve the generalization ability and to prevent overfitting...

Main Authors: | Seng, Dewen; Wu, Xin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2023 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9920874/ https://www.ncbi.nlm.nih.gov/pubmed/36772327 http://dx.doi.org/10.3390/s23031287 |
_version_ | 1784887177118744576 |
---|---|
author | Seng, Dewen; Wu, Xin
author_facet | Seng, Dewen; Wu, Xin
author_sort | Seng, Dewen |
collection | PubMed |
description | Generalization has always been a keyword in deep learning. Pretrained models and domain adaptation technology have received widespread attention in solving the problem of generalization. They are all focused on finding features in data to improve the generalization ability and to prevent overfitting. Although they have achieved good results in various tasks, those models are unstable when classifying a sentence whose label is positive but still contains negative phrases. In this article, we analyzed the attention heat maps of the benchmarks and found that previous models pay more attention to individual phrases than to the semantic information of the whole sentence. We therefore proposed a method to scatter the attention away from opposite-sentiment words to avoid a one-sided judgment. We designed a two-stream network and stacked a gradient reversal layer and a feature projection layer within the auxiliary network. The gradient reversal layer reverses the gradient of features during training, so the parameters are optimized following the reversed gradient in the backpropagation stage. We utilized the auxiliary network to extract backward features and then fed them into the main network, merging them with the normal features extracted by the main network. We applied this method to three baselines, TextCNN, BERT, and RoBERTa, on sentiment analysis and sarcasm detection datasets. The results show that our method improves accuracy on the sentiment analysis datasets by 0.5% and on the sarcasm detection datasets by 2.1%. |
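The abstract's central mechanism, a gradient reversal layer feeding an auxiliary branch whose "backward" features are fused with the main branch's features, can be sketched as follows. This is a minimal illustration of the general technique, not the authors' implementation; the class and function names (`GradientReversal`, `fuse`) and the choice of concatenation for fusion are assumptions for the example.

```python
# Minimal sketch of a gradient reversal layer: identity in the forward
# pass, gradient scaled by -lambda in the backward pass, so the branch
# behind it is optimized against the usual objective.
class GradientReversal:
    def __init__(self, lam: float = 1.0):
        self.lam = lam

    def forward(self, x):
        # Forward pass: features pass through unchanged.
        return x

    def backward(self, grad):
        # Backward pass: reverse (and scale) the incoming gradient
        # before it reaches earlier layers of the auxiliary branch.
        return [-self.lam * g for g in grad]


def fuse(main_features, aux_features):
    # Merge auxiliary "backward" features with the main branch's
    # features; concatenation is one simple fusion choice.
    return main_features + aux_features


grl = GradientReversal(lam=0.5)
feat = [0.2, -1.0, 3.0]
assert grl.forward(feat) == feat                          # identity forward
assert grl.backward([1.0, 1.0, 1.0]) == [-0.5, -0.5, -0.5]  # reversed gradient
assert fuse([0.2, -1.0], [0.4, 0.1]) == [0.2, -1.0, 0.4, 0.1]
```

In a real two-stream model the reversed gradient would be handled by the framework's autograd (e.g., a custom backward function), and fusion would happen before the final classifier.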
format | Online Article Text |
id | pubmed-9920874 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9920874 2023-02-12 Enhancing the Generalization for Text Classification through Fusion of Backward Features Seng, Dewen Wu, Xin Sensors (Basel) Article MDPI 2023-01-23 /pmc/articles/PMC9920874/ /pubmed/36772327 http://dx.doi.org/10.3390/s23031287 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Seng, Dewen Wu, Xin Enhancing the Generalization for Text Classification through Fusion of Backward Features |
title | Enhancing the Generalization for Text Classification through Fusion of Backward Features |
title_full | Enhancing the Generalization for Text Classification through Fusion of Backward Features |
title_fullStr | Enhancing the Generalization for Text Classification through Fusion of Backward Features |
title_full_unstemmed | Enhancing the Generalization for Text Classification through Fusion of Backward Features |
title_short | Enhancing the Generalization for Text Classification through Fusion of Backward Features |
title_sort | enhancing the generalization for text classification through fusion of backward features |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9920874/ https://www.ncbi.nlm.nih.gov/pubmed/36772327 http://dx.doi.org/10.3390/s23031287 |
work_keys_str_mv | AT sengdewen enhancingthegeneralizationfortextclassificationthroughfusionofbackwardfeatures AT wuxin enhancingthegeneralizationfortextclassificationthroughfusionofbackwardfeatures |