Object-Level Visual-Text Correlation Graph Hashing for Unsupervised Cross-Modal Retrieval

Bibliographic Details
Main Authors: Shi, Ge, Li, Feng, Wu, Lifang, Chen, Yukun
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9029824/
https://www.ncbi.nlm.nih.gov/pubmed/35458906
http://dx.doi.org/10.3390/s22082921
author Shi, Ge
Li, Feng
Wu, Lifang
Chen, Yukun
collection PubMed
description The core of cross-modal hashing methods is to map high-dimensional features into binary hash codes, so that the Hamming distance metric can be used to make retrieval efficient. Recent developments emphasize the advantages of unsupervised cross-modal hashing techniques, since they rely only on the relevance information of paired data, making them more applicable to real-world applications. However, two problems, namely intra-modality correlation and inter-modality correlation, have still not been fully considered. Intra-modality correlation describes the complex overall concept of a single modality and provides semantic relevance for retrieval tasks, while inter-modality correlation refers to the relationship between different modalities. Based on our observation and hypothesis, the dependency relationships within a modality and between different modalities can be constructed at the object level, which can further improve cross-modal hashing retrieval accuracy. To this end, we propose an Object-level Visual-text Correlation Graph Hashing (OVCGH) approach to mine the fine-grained object-level similarity in cross-modal data while suppressing noise interference. Specifically, a novel intra-modality correlation graph is designed to learn graph-level representations of different modalities, obtaining the region-to-region and tag-to-tag dependency relationships in an unsupervised manner. Then, we design a visual-text dependency building module that captures correlated semantic information between different modalities by modeling the dependency relationship between image object regions and text tags. Extensive experiments on two widely used datasets verify the effectiveness of our proposed approach.
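
For readers unfamiliar with the mechanism the abstract summarizes, the sketch below shows why binary hash codes make cross-modal retrieval fast: once both modalities are mapped to codes of the same length, candidates can be ranked by Hamming distance with cheap bitwise comparisons. This is a minimal illustration only, not the paper's OVCGH model; the random projections stand in for the learned graph-based hashing functions, and all dimensions (512-d image features, 300-d text features, 64-bit codes) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def hash_codes(features, projection):
    """Binarize projected features into {0, 1} hash codes."""
    return (features @ projection > 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query code."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists), dists

# Hypothetical setup: random projections play the role of the learned
# modality-specific hashing functions (NOT the OVCGH graph modules).
img_proj = rng.standard_normal((512, 64))
txt_proj = rng.standard_normal((300, 64))

img_db = hash_codes(rng.standard_normal((1000, 512)), img_proj)  # image "database"
txt_query = hash_codes(rng.standard_normal((1, 300)), txt_proj)  # one text query

order, dists = hamming_rank(txt_query[0], img_db)
print("top-5 image ids:", order[:5], "distances:", dists[order[:5]])
```

The point of the paper is what replaces the random projections: hashing functions trained so that object-level intra- and inter-modality correlations are preserved, which makes small Hamming distances correspond to semantically related image-text pairs.
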
format Online
Article
Text
id pubmed-9029824
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
journal Sensors (Basel)
published 2022-04-11
rights © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Object-Level Visual-Text Correlation Graph Hashing for Unsupervised Cross-Modal Retrieval
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9029824/
https://www.ncbi.nlm.nih.gov/pubmed/35458906
http://dx.doi.org/10.3390/s22082921