Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval
Deep hashing is the mainstream approach to large-scale cross-modal retrieval because of its high retrieval speed and low storage cost, but reconstructing modal semantic information remains very challenging. To further address the problem of semantic reconstruction in unsupervised cross-modal retrieval, we propose a novel deep semantic-preserving reconstruction hashing (DSPRH) algorithm. It combines spatial and channel semantic information and mines modal semantics through adaptive self-encoding and a joint semantic reconstruction loss. The main contributions are as follows: (1) We introduce a new spatial pooling network module, based on tensor regular-polymorphic decomposition theory, that generates a rank-1 tensor to capture high-order context semantics; this helps the backbone network capture important contextual modal semantic information. (2) From an optimization perspective, we use global covariance pooling to capture channel semantic information and accelerate network convergence. In the feature reconstruction layer, we use two-bottleneck auto-encoding to achieve visual-text modal interaction. (3) In metric learning, we design a new loss function to optimize model parameters while preserving the correlation between image and text modalities. DSPRH is evaluated on MIRFlickr-25K and NUS-WIDE, and the experimental results show that it achieves better retrieval performance.
Main Authors: | Cheng, Shuli; Wang, Liejun; Du, Anyu |
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2020 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7712897/ https://www.ncbi.nlm.nih.gov/pubmed/33287034 http://dx.doi.org/10.3390/e22111266 |
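The abstract describes a spatial pooling module that generates a rank-1 tensor to capture high-order context semantics. As a rough, hypothetical illustration of what a rank-1 spatial attention map looks like (the function name, pooling choices, and shapes below are ours for illustration, not the paper's actual module):

```python
import numpy as np

def rank1_spatial_pooling(feat):
    """Reweight a (C, H, W) feature map with a rank-1 spatial attention map.

    The map is the outer product of a row profile and a column profile,
    so it has rank 1 by construction (a CP-decomposition-style factor).
    """
    s = feat.mean(axis=0)            # (H, W) channel-averaged energy map
    h_vec = s.mean(axis=1)           # (H,) row profile
    w_vec = s.mean(axis=0)           # (W,) column profile
    attn = np.outer(h_vec, w_vec)    # (H, W) rank-1 attention map
    return feat * attn, attn         # broadcast reweighting over channels

# Toy example
feat = np.ones((3, 4, 5))
out, attn = rank1_spatial_pooling(feat)
print(out.shape)  # (3, 4, 5)
```

Because the attention map is an outer product of two vectors, its matrix rank is exactly 1, which is what lets such a module summarize spatial context cheaply.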
_version_ | 1783618471186137088 |
author | Cheng, Shuli; Wang, Liejun; Du, Anyu |
author_facet | Cheng, Shuli; Wang, Liejun; Du, Anyu |
author_sort | Cheng, Shuli |
collection | PubMed |
description | Deep hashing is the mainstream approach to large-scale cross-modal retrieval because of its high retrieval speed and low storage cost, but reconstructing modal semantic information remains very challenging. To further address the problem of semantic reconstruction in unsupervised cross-modal retrieval, we propose a novel deep semantic-preserving reconstruction hashing (DSPRH) algorithm. It combines spatial and channel semantic information and mines modal semantics through adaptive self-encoding and a joint semantic reconstruction loss. The main contributions are as follows: (1) We introduce a new spatial pooling network module, based on tensor regular-polymorphic decomposition theory, that generates a rank-1 tensor to capture high-order context semantics; this helps the backbone network capture important contextual modal semantic information. (2) From an optimization perspective, we use global covariance pooling to capture channel semantic information and accelerate network convergence. In the feature reconstruction layer, we use two-bottleneck auto-encoding to achieve visual-text modal interaction. (3) In metric learning, we design a new loss function to optimize model parameters while preserving the correlation between image and text modalities. DSPRH is evaluated on MIRFlickr-25K and NUS-WIDE, and the experimental results show that it achieves better retrieval performance. |
format | Online Article Text |
id | pubmed-7712897 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7712897 2021-02-24 Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval Cheng, Shuli; Wang, Liejun; Du, Anyu Entropy (Basel) Article Deep hashing is the mainstream approach to large-scale cross-modal retrieval because of its high retrieval speed and low storage cost, but reconstructing modal semantic information remains very challenging. To further address the problem of semantic reconstruction in unsupervised cross-modal retrieval, we propose a novel deep semantic-preserving reconstruction hashing (DSPRH) algorithm. It combines spatial and channel semantic information and mines modal semantics through adaptive self-encoding and a joint semantic reconstruction loss. The main contributions are as follows: (1) We introduce a new spatial pooling network module, based on tensor regular-polymorphic decomposition theory, that generates a rank-1 tensor to capture high-order context semantics; this helps the backbone network capture important contextual modal semantic information. (2) From an optimization perspective, we use global covariance pooling to capture channel semantic information and accelerate network convergence. In the feature reconstruction layer, we use two-bottleneck auto-encoding to achieve visual-text modal interaction. (3) In metric learning, we design a new loss function to optimize model parameters while preserving the correlation between image and text modalities. DSPRH is evaluated on MIRFlickr-25K and NUS-WIDE, and the experimental results show that it achieves better retrieval performance. MDPI 2020-11-07 /pmc/articles/PMC7712897/ /pubmed/33287034 http://dx.doi.org/10.3390/e22111266 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Cheng, Shuli Wang, Liejun Du, Anyu Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval |
title | Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval |
title_full | Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval |
title_fullStr | Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval |
title_full_unstemmed | Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval |
title_short | Deep Semantic-Preserving Reconstruction Hashing for Unsupervised Cross-Modal Retrieval |
title_sort | deep semantic-preserving reconstruction hashing for unsupervised cross-modal retrieval |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7712897/ https://www.ncbi.nlm.nih.gov/pubmed/33287034 http://dx.doi.org/10.3390/e22111266 |
work_keys_str_mv | AT chengshuli deepsemanticpreservingreconstructionhashingforunsupervisedcrossmodalretrieval AT wangliejun deepsemanticpreservingreconstructionhashingforunsupervisedcrossmodalretrieval AT duanyu deepsemanticpreservingreconstructionhashingforunsupervisedcrossmodalretrieval |
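The record's description also mentions global covariance pooling for capturing channel semantic information. A minimal sketch of plain (unnormalized) global covariance pooling is below; the function name and shapes are our assumptions, and the paper's module likely adds normalization steps this sketch omits:

```python
import numpy as np

def global_covariance_pooling(feat):
    """Second-order (covariance) pooling of a (C, H, W) feature map.

    Flattens the spatial dimensions, centers each channel, and returns
    the (C, C) channel covariance matrix, which summarizes second-order
    channel statistics instead of a first-order mean or max.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)             # (C, N) with N spatial positions
    x = x - x.mean(axis=1, keepdims=True)  # center each channel
    cov = x @ x.T / (h * w - 1)            # (C, C) channel covariance
    return cov

# Toy example: 4 channels over an 8x8 spatial grid
feat = np.random.default_rng(0).normal(size=(4, 8, 8))
cov = global_covariance_pooling(feat)
print(cov.shape)  # (4, 4)
```

The resulting descriptor is symmetric by construction, and its size depends only on the channel count, not on the spatial resolution.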