A Survey on Learning Objects' Relationship for Image Captioning

Image captioning is a challenging modality transformation task in computer vision and natural language processing, aiming to understand the image content and describe it with a natural language. Recently, the relationship information between objects in the image has been investigated to be of import...

Bibliographic Details
Main authors: Runyan, Du; Wenkai, Zhang; Zhi, Guo; Xian, Sun
Format: Online Article Text
Language: English
Published: Hindawi 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10241575/
https://www.ncbi.nlm.nih.gov/pubmed/37284051
http://dx.doi.org/10.1155/2023/8600853
_version_ 1785054015157960704
author Runyan, Du
Wenkai, Zhang
Zhi, Guo
Xian, Sun
collection PubMed
description Image captioning is a challenging modality transformation task in computer vision and natural language processing that aims to understand the content of an image and describe it in natural language. Recently, relationship information between the objects in an image has been shown to be important for generating more vivid and readable sentences. Much research has addressed mining and learning these relationships and incorporating them into captioning models. This paper mainly summarizes the methods of relational representation and relational encoding in image captioning. In addition, we discuss the advantages and disadvantages of these methods and list commonly used datasets for the relational captioning task. Finally, the current problems and challenges in this task are highlighted.
format Online
Article
Text
id pubmed-10241575
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-10241575 2023-06-06 A Survey on Learning Objects' Relationship for Image Captioning Runyan, Du Wenkai, Zhang Zhi, Guo Xian, Sun Comput Intell Neurosci Research Article Image captioning is a challenging modality transformation task in computer vision and natural language processing, aiming to understand the image content and describe it with a natural language. Recently, the relationship information between objects in the image has been investigated to be of importance in generating a more vivid and readable sentence. Many types of research have been done in relationship mining and learning for leveraging into the caption models. This paper mainly summarizes the methods of relational representation and relational encoding in image captioning. Besides, we discuss the advantages and disadvantages of these methods and provide commonly used datasets for the relational captioning task. Finally, the current problems and challenges in this task are highlighted. Hindawi 2023-05-29 /pmc/articles/PMC10241575/ /pubmed/37284051 http://dx.doi.org/10.1155/2023/8600853 Text en Copyright © 2023 Du Runyan et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Survey on Learning Objects' Relationship for Image Captioning
topic Research Article