Multimodal interaction enhanced representation learning for video emotion recognition
Video emotion recognition aims to infer human emotional states from the audio, visual, and text modalities. Previous approaches center on designing sophisticated fusion mechanisms, but usually ignore the fact that text contains global semantic information, while speech and face video show...
Main Authors: Xia, Xiaohan; Zhao, Yong; Jiang, Dongmei
Format: Online Article Text
Language: English
Published: Frontiers Media S.A., 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9806211/ | https://www.ncbi.nlm.nih.gov/pubmed/36601594 | http://dx.doi.org/10.3389/fnins.2022.1086380
Similar Items

- Multimodal Recognition of Emotions in Music and Facial Expressions
  by: Proverbio, Alice Mado, et al.
  Published: (2020)
- Contrastive self-supervised representation learning without negative samples for multimodal human action recognition
  by: Yang, Huaigang, et al.
  Published: (2023)
- Multimodal transformer augmented fusion for speech emotion recognition
  by: Wang, Yuanyuan, et al.
  Published: (2023)
- Embodied Object Representation Learning and Recognition
  by: Van de Maele, Toon, et al.
  Published: (2022)
- Unimodal statistical learning produces multimodal object-like representations
  by: Lengyel, Gábor, et al.
  Published: (2019)