
Cross-Modal Sentiment Sensing with Visual-Augmented Representation and Diverse Decision Fusion

The rising use of online media has changed the social habits of the public. Users have become accustomed to sharing daily experiences and publishing personal opinions on social networks. Social data carrying emotion and attitude provides significant decision support for numerous tasks in sentiment analysis. Conventional methods for sentiment classification consider only the textual modality and are vulnerable in multimodal scenarios, while common multimodal approaches focus only on the interactive relationships among modalities without considering unique intra-modal information. This paper proposes a hybrid fusion network that captures both inter-modal and intra-modal features. First, in the representation-fusion stage, a multi-head visual attention mechanism is proposed to extract accurate semantic and sentimental information from textual content under the guidance of visual features. Then, in the decision-fusion stage, multiple base classifiers are trained to learn independent and diverse discriminative information from the different modal representations. The final decision is obtained by fusing the decision supports of the base classifiers via a decision fusion method. To improve the generalization of the hybrid fusion network, a similarity loss is employed to inject decision diversity into the whole model. Empirical results on five multimodal datasets demonstrate that the proposed model achieves higher accuracy and better generalization for multimodal sentiment analysis.


Bibliographic Details
Main Authors: Zhang, Sun, Li, Bo, Yin, Chunyong
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8747244/
https://www.ncbi.nlm.nih.gov/pubmed/35009620
http://dx.doi.org/10.3390/s22010074
_version_ 1784630786741239808
author Zhang, Sun
Li, Bo
Yin, Chunyong
author_facet Zhang, Sun
Li, Bo
Yin, Chunyong
author_sort Zhang, Sun
collection PubMed
description The rising use of online media has changed the social habits of the public. Users have become accustomed to sharing daily experiences and publishing personal opinions on social networks. Social data carrying emotion and attitude provides significant decision support for numerous tasks in sentiment analysis. Conventional methods for sentiment classification consider only the textual modality and are vulnerable in multimodal scenarios, while common multimodal approaches focus only on the interactive relationships among modalities without considering unique intra-modal information. This paper proposes a hybrid fusion network that captures both inter-modal and intra-modal features. First, in the representation-fusion stage, a multi-head visual attention mechanism is proposed to extract accurate semantic and sentimental information from textual content under the guidance of visual features. Then, in the decision-fusion stage, multiple base classifiers are trained to learn independent and diverse discriminative information from the different modal representations. The final decision is obtained by fusing the decision supports of the base classifiers via a decision fusion method. To improve the generalization of the hybrid fusion network, a similarity loss is employed to inject decision diversity into the whole model. Empirical results on five multimodal datasets demonstrate that the proposed model achieves higher accuracy and better generalization for multimodal sentiment analysis.
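To make the described architecture concrete, the following is a minimal, hypothetical PyTorch sketch of the hybrid fusion idea from the abstract: visual features guide multi-head attention over textual features (representation fusion), per-modality base classifiers produce decision supports that are averaged (decision fusion), and a pairwise cosine-similarity penalty encourages decision diversity. The class name HybridFusionSketch, all dimensions, the averaging rule, and the loss form are illustrative assumptions, not the authors' implementation.

# Illustrative sketch only; names, dimensions, and fusion/diversity formulas
# are assumptions based on the abstract, not the paper's actual code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridFusionSketch(nn.Module):
    def __init__(self, text_dim=768, vis_dim=512, hidden=256, n_heads=4, n_classes=3):
        super().__init__()
        # Project both modalities into a shared space.
        self.text_proj = nn.Linear(text_dim, hidden)
        self.vis_proj = nn.Linear(vis_dim, hidden)
        # Representation fusion: visual features act as the query that guides
        # multi-head attention over textual token features.
        self.visual_attn = nn.MultiheadAttention(hidden, n_heads, batch_first=True)
        # Decision fusion: independent base classifiers, one per representation.
        self.text_clf = nn.Linear(hidden, n_classes)
        self.vis_clf = nn.Linear(hidden, n_classes)
        self.fused_clf = nn.Linear(hidden, n_classes)

    def forward(self, text_tokens, vis_feat):
        # text_tokens: (B, T, text_dim); vis_feat: (B, vis_dim)
        t = self.text_proj(text_tokens)               # (B, T, hidden)
        v = self.vis_proj(vis_feat).unsqueeze(1)      # (B, 1, hidden)
        fused, _ = self.visual_attn(query=v, key=t, value=t)
        fused = fused.squeeze(1)                      # (B, hidden)
        # Decision supports (class probabilities) from each base classifier.
        probs = [
            F.softmax(self.text_clf(t.mean(dim=1)), dim=-1),
            F.softmax(self.vis_clf(v.squeeze(1)), dim=-1),
            F.softmax(self.fused_clf(fused), dim=-1),
        ]
        # Simple decision fusion: average the decision supports.
        decision = torch.stack(probs).mean(dim=0)
        return decision, probs

def diversity_loss(probs):
    # Mean pairwise cosine similarity between the base classifiers' decision
    # supports; minimizing it pushes the classifiers toward diverse decisions.
    loss, pairs = 0.0, 0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            loss = loss + F.cosine_similarity(probs[i], probs[j], dim=-1).mean()
            pairs += 1
    return loss / pairs

# Example: decision supports and diversity penalty for a dummy batch.
decision, probs = HybridFusionSketch()(torch.randn(2, 20, 768), torch.randn(2, 512))
penalty = diversity_loss(probs)

Averaging the decision supports and penalizing pairwise cosine similarity are only one plausible reading of "decision fusion method" and "similarity loss"; the paper's actual formulations may differ.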
format Online
Article
Text
id pubmed-8747244
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8747244 2022-01-11 Cross-Modal Sentiment Sensing with Visual-Augmented Representation and Diverse Decision Fusion Zhang, Sun Li, Bo Yin, Chunyong Sensors (Basel) Article The rising use of online media has changed the social habits of the public. Users have become accustomed to sharing daily experiences and publishing personal opinions on social networks. Social data carrying emotion and attitude provides significant decision support for numerous tasks in sentiment analysis. Conventional methods for sentiment classification consider only the textual modality and are vulnerable in multimodal scenarios, while common multimodal approaches focus only on the interactive relationships among modalities without considering unique intra-modal information. This paper proposes a hybrid fusion network that captures both inter-modal and intra-modal features. First, in the representation-fusion stage, a multi-head visual attention mechanism is proposed to extract accurate semantic and sentimental information from textual content under the guidance of visual features. Then, in the decision-fusion stage, multiple base classifiers are trained to learn independent and diverse discriminative information from the different modal representations. The final decision is obtained by fusing the decision supports of the base classifiers via a decision fusion method. To improve the generalization of the hybrid fusion network, a similarity loss is employed to inject decision diversity into the whole model. Empirical results on five multimodal datasets demonstrate that the proposed model achieves higher accuracy and better generalization for multimodal sentiment analysis. MDPI 2021-12-23 /pmc/articles/PMC8747244/ /pubmed/35009620 http://dx.doi.org/10.3390/s22010074 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Zhang, Sun
Li, Bo
Yin, Chunyong
Cross-Modal Sentiment Sensing with Visual-Augmented Representation and Diverse Decision Fusion
title Cross-Modal Sentiment Sensing with Visual-Augmented Representation and Diverse Decision Fusion
title_full Cross-Modal Sentiment Sensing with Visual-Augmented Representation and Diverse Decision Fusion
title_fullStr Cross-Modal Sentiment Sensing with Visual-Augmented Representation and Diverse Decision Fusion
title_full_unstemmed Cross-Modal Sentiment Sensing with Visual-Augmented Representation and Diverse Decision Fusion
title_short Cross-Modal Sentiment Sensing with Visual-Augmented Representation and Diverse Decision Fusion
title_sort cross-modal sentiment sensing with visual-augmented representation and diverse decision fusion
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8747244/
https://www.ncbi.nlm.nih.gov/pubmed/35009620
http://dx.doi.org/10.3390/s22010074
work_keys_str_mv AT zhangsun crossmodalsentimentsensingwithvisualaugmentedrepresentationanddiversedecisionfusion
AT libo crossmodalsentimentsensingwithvisualaugmentedrepresentationanddiversedecisionfusion
AT yinchunyong crossmodalsentimentsensingwithvisualaugmentedrepresentationanddiversedecisionfusion