HAAN: Learning a Hierarchical Adaptive Alignment Network for Image-Text Retrieval
Image-text retrieval aims to search related results of one modality by querying another modality. As a fundamental and key problem in cross-modal retrieval, image-text retrieval is still a challenging problem owing to the complementary and imbalanced relationship between different modalities (i.e.,...
Main Authors: | Wang, Shuhuai; Liu, Zheng; Pei, Xinlei; Xu, Junhao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10007124/ https://www.ncbi.nlm.nih.gov/pubmed/36904776 http://dx.doi.org/10.3390/s23052559 |
_version_ | 1784905440762527744 |
---|---|
author | Wang, Shuhuai Liu, Zheng Pei, Xinlei Xu, Junhao |
author_sort | Wang, Shuhuai |
collection | PubMed |
description | Image-text retrieval aims to search related results of one modality by querying another modality. As a fundamental and key problem in cross-modal retrieval, image-text retrieval is still a challenging problem owing to the complementary and imbalanced relationship between different modalities (i.e., Image and Text) and different granularities (i.e., Global-level and Local-level). However, existing works have not fully considered how to effectively mine and fuse the complementarities between images and texts at different granularities. Therefore, in this paper, we propose a hierarchical adaptive alignment network, whose contributions are as follows: (1) We propose a multi-level alignment network, which simultaneously mines global-level and local-level data, thereby enhancing the semantic association between images and texts. (2) We propose an adaptive weighted loss to flexibly optimize the image-text similarity with two stages in a unified framework. (3) We conduct extensive experiments on three public benchmark datasets (Corel 5K, Pascal Sentence, and Wiki) and compare them with eleven state-of-the-art methods. The experimental results thoroughly verify the effectiveness of our proposed method. |
format | Online Article Text |
id | pubmed-10007124 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-100071242023-03-12 HAAN: Learning a Hierarchical Adaptive Alignment Network for Image-Text Retrieval Wang, Shuhuai Liu, Zheng Pei, Xinlei Xu, Junhao Sensors (Basel) Article Image-text retrieval aims to search related results of one modality by querying another modality. As a fundamental and key problem in cross-modal retrieval, image-text retrieval is still a challenging problem owing to the complementary and imbalanced relationship between different modalities (i.e., Image and Text) and different granularities (i.e., Global-level and Local-level). However, existing works have not fully considered how to effectively mine and fuse the complementarities between images and texts at different granularities. Therefore, in this paper, we propose a hierarchical adaptive alignment network, whose contributions are as follows: (1) We propose a multi-level alignment network, which simultaneously mines global-level and local-level data, thereby enhancing the semantic association between images and texts. (2) We propose an adaptive weighted loss to flexibly optimize the image-text similarity with two stages in a unified framework. (3) We conduct extensive experiments on three public benchmark datasets (Corel 5K, Pascal Sentence, and Wiki) and compare them with eleven state-of-the-art methods. The experimental results thoroughly verify the effectiveness of our proposed method. MDPI 2023-02-25 /pmc/articles/PMC10007124/ /pubmed/36904776 http://dx.doi.org/10.3390/s23052559 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | HAAN: Learning a Hierarchical Adaptive Alignment Network for Image-Text Retrieval |
topic | Article |