A Survey on Learning Objects' Relationship for Image Captioning
Image captioning is a challenging modality transformation task in computer vision and natural language processing, aiming to understand the image content and describe it with a natural language. Recently, the relationship information between objects in the image has been investigated to be of import...
Main authors: | Runyan, Du; Wenkai, Zhang; Zhi, Guo; Xian, Sun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2023 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10241575/ https://www.ncbi.nlm.nih.gov/pubmed/37284051 http://dx.doi.org/10.1155/2023/8600853 |
_version_ | 1785054015157960704 |
---|---|
author | Runyan, Du; Wenkai, Zhang; Zhi, Guo; Xian, Sun |
author_facet | Runyan, Du; Wenkai, Zhang; Zhi, Guo; Xian, Sun |
author_sort | Runyan, Du |
collection | PubMed |
description | Image captioning is a challenging modality transformation task in computer vision and natural language processing, aiming to understand the image content and describe it with a natural language. Recently, the relationship information between objects in the image has been investigated to be of importance in generating a more vivid and readable sentence. Many types of research have been done in relationship mining and learning for leveraging into the caption models. This paper mainly summarizes the methods of relational representation and relational encoding in image captioning. Besides, we discuss the advantages and disadvantages of these methods and provide commonly used datasets for the relational captioning task. Finally, the current problems and challenges in this task are highlighted. |
format | Online Article Text |
id | pubmed-10241575 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-10241575 2023-06-06 A Survey on Learning Objects' Relationship for Image Captioning Runyan, Du; Wenkai, Zhang; Zhi, Guo; Xian, Sun Comput Intell Neurosci Research Article Image captioning is a challenging modality transformation task in computer vision and natural language processing, aiming to understand the image content and describe it with a natural language. Recently, the relationship information between objects in the image has been investigated to be of importance in generating a more vivid and readable sentence. Many types of research have been done in relationship mining and learning for leveraging into the caption models. This paper mainly summarizes the methods of relational representation and relational encoding in image captioning. Besides, we discuss the advantages and disadvantages of these methods and provide commonly used datasets for the relational captioning task. Finally, the current problems and challenges in this task are highlighted. Hindawi 2023-05-29 /pmc/articles/PMC10241575/ /pubmed/37284051 http://dx.doi.org/10.1155/2023/8600853 Text en Copyright © 2023 Du Runyan et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Runyan, Du Wenkai, Zhang Zhi, Guo Xian, Sun A Survey on Learning Objects' Relationship for Image Captioning |
title | A Survey on Learning Objects' Relationship for Image Captioning |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10241575/ https://www.ncbi.nlm.nih.gov/pubmed/37284051 http://dx.doi.org/10.1155/2023/8600853 |