Fusion of Multi-Modal Features to Enhance Dense Video Caption

Dense video caption is a task that aims to help computers analyze the content of a video by generating abstract captions for a sequence of video frames. However, most of the existing methods only use visual features in the video and ignore the audio features that are also essential for understanding...

Bibliographic Details
Main Authors:	Huang, Xuefei; Chan, Ka-Hou; Wu, Weifan; Sheng, Hao; Ke, Wei
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10304565/ https://www.ncbi.nlm.nih.gov/pubmed/37420732 http://dx.doi.org/10.3390/s23125565
