
Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval

Deep hashing is the mainstream approach to large-scale cross-modal retrieval because of its high retrieval speed and low storage cost, but reconstructing modal semantic information remains very challenging. To further address semantic reconstruction in unsupervised cross-modal retrieval, we propose a novel deep semantic-preserving reconstruction hashing (DSPRH). The algorithm combines spatial and channel semantic information and mines modal semantics through adaptive self-encoding and a joint semantic reconstruction loss. The main contributions are as follows: (1) We introduce a new spatial pooling network module based on tensor regular-polymorphic decomposition theory that generates rank-1 tensors to capture high-order contextual semantics and helps the backbone network extract important contextual modal semantic information. (2) From an optimization perspective, we use global covariance pooling to capture channel semantic information and accelerate network convergence; in the feature reconstruction layer, two bottleneck autoencoders enable visual-text modal interaction. (3) For metric learning, we design a new loss function to optimize the model parameters while preserving the correlation between the image and text modalities. DSPRH is evaluated on MIRFlickr-25K and NUS-WIDE, and the experimental results show that it achieves better performance on retrieval tasks.
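The abstract sketches three components: a rank-1 tensor spatial pooling module, global covariance pooling for channel-level semantics, and two bottleneck autoencoders trained with a joint semantic reconstruction loss. The snippet below is a minimal, hypothetical illustration of the last two ideas only, assuming a PyTorch-style setup; it is not the authors' DSPRH implementation, and all module names, layer sizes, and loss weights are assumptions made for the sketch.

```python
# Illustrative sketch only: NOT the authors' DSPRH code. Module names,
# dimensions, and loss weights are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


def global_covariance_pooling(feat):
    """Second-order (covariance) pooling over spatial positions.

    feat: (B, C, H, W) convolutional feature map.
    Returns a (B, C, C) channel-covariance matrix per sample, one common way
    to capture channel-wise second-order statistics.
    """
    b, c, h, w = feat.shape
    x = feat.reshape(b, c, h * w)               # flatten spatial dimensions
    x = x - x.mean(dim=2, keepdim=True)         # center over positions
    cov = torch.bmm(x, x.transpose(1, 2)) / (h * w - 1)
    return cov


class BottleneckAE(nn.Module):
    """A small bottleneck autoencoder: feature -> relaxed hash code -> reconstruction."""

    def __init__(self, in_dim, code_len):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(inplace=True),
            nn.Linear(512, code_len), nn.Tanh(),  # relaxed binary code in [-1, 1]
        )
        self.decoder = nn.Sequential(
            nn.Linear(code_len, 512), nn.ReLU(inplace=True),
            nn.Linear(512, in_dim),
        )

    def forward(self, x):
        code = self.encoder(x)
        recon = self.decoder(code)
        return code, recon


def joint_reconstruction_loss(img_feat, txt_feat, img_ae, txt_ae,
                              alpha=1.0, beta=1.0):
    """Unsupervised joint loss: per-modality reconstruction plus cross-modal
    similarity preservation between the relaxed hash codes (no labels used)."""
    img_code, img_recon = img_ae(img_feat)
    txt_code, txt_recon = txt_ae(txt_feat)

    # Reconstruction terms keep modality-specific semantics in the codes.
    recon = F.mse_loss(img_recon, img_feat) + F.mse_loss(txt_recon, txt_feat)

    # Align the cosine-similarity structure across modalities so correlated
    # image/text pairs receive similar codes.
    img_n = F.normalize(img_code, dim=1)
    txt_n = F.normalize(txt_code, dim=1)
    sim_img = img_n @ img_n.t()
    sim_txt = txt_n @ txt_n.t()
    cross = F.mse_loss(sim_img, sim_txt) + \
        F.mse_loss(img_n @ txt_n.t(), (sim_img + sim_txt) / 2)

    return alpha * recon + beta * cross
```

In a full pipeline of this kind, img_feat and txt_feat would come from an image backbone (with the pooling modules) and a text encoder, and the relaxed codes would typically be binarized with sign() at retrieval time.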


Bibliographic Details
Main Authors: Cheng, Shuli; Wang, Liejun; Du, Anyu
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7712897/
https://www.ncbi.nlm.nih.gov/pubmed/33287034
http://dx.doi.org/10.3390/e22111266
author Cheng, Shuli
Wang, Liejun
Du, Anyu
collection PubMed
format Online
Article
Text
id pubmed-7712897
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7712897 2021-02-24 Entropy (Basel) Article MDPI 2020-11-07 /pmc/articles/PMC7712897/ /pubmed/33287034 http://dx.doi.org/10.3390/e22111266 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7712897/
https://www.ncbi.nlm.nih.gov/pubmed/33287034
http://dx.doi.org/10.3390/e22111266