Multimodal interaction enhanced representation learning for video emotion recognition
Video emotion recognition aims to infer human emotional states from the audio, visual, and text modalities. Previous approaches are centered around designing sophisticated fusion mechanisms, but usually ignore the fact that text contains global semantic information, while speech and face video show...
Main authors: | , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A., 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9806211/ https://www.ncbi.nlm.nih.gov/pubmed/36601594 http://dx.doi.org/10.3389/fnins.2022.1086380 |