Multi-modal adaptive gated mechanism for visual question answering

Visual Question Answering (VQA) is a multimodal task in which questions about image content are asked and answered in natural language. For multimodal tasks, obtaining accurate feature information for each modality is crucial. Existing research on visual question answering models mainly starts from the p...

Bibliographic Details
Main Authors: Xu, Yangshuyi; Zhang, Lin; Shen, Xiang
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10306234/
https://www.ncbi.nlm.nih.gov/pubmed/37379280
http://dx.doi.org/10.1371/journal.pone.0287557