Hierarchical Multimodal Adaptive Fusion (HMAF) Network for Prediction of RGB-D Saliency

Visual saliency prediction for RGB-D images is more challenging than that for their RGB counterparts. Additionally, very few investigations have been undertaken concerning RGB-D-saliency prediction. The proposed study presents a method based on a hierarchical multimodal adaptive fusion (HMAF) networ...

Bibliographic Details
Main Authors: Lv, Ying, Zhou, Wujie
Format: Online Article Text
Language: English
Published: Hindawi 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7700038/
https://www.ncbi.nlm.nih.gov/pubmed/33293945
http://dx.doi.org/10.1155/2020/8841681
_version_ 1783616185809502208
author Lv, Ying
Zhou, Wujie
author_facet Lv, Ying
Zhou, Wujie
author_sort Lv, Ying
collection PubMed
description Visual saliency prediction for RGB-D images is more challenging than that for their RGB counterparts. Additionally, very few investigations have been undertaken concerning RGB-D-saliency prediction. The proposed study presents a method based on a hierarchical multimodal adaptive fusion (HMAF) network to facilitate end-to-end prediction of RGB-D saliency. In the proposed method, hierarchical (multilevel) multimodal features are first extracted from an RGB image and depth map using a VGG-16-based two-stream network. Subsequently, the most significant hierarchical features of the said RGB image and depth map are predicted using three two-input attention modules. Furthermore, adaptive fusion of saliencies concerning the above-mentioned fused saliency features of different levels (hierarchical fusion saliency features) can be accomplished using a three-input attention module to facilitate high-accuracy RGB-D visual saliency prediction. Comparisons based on the application of the proposed HMAF-based approach against those of other state-of-the-art techniques on two challenging RGB-D datasets demonstrate that the proposed method outperforms other competing approaches consistently by a considerable margin.
format Online
Article
Text
id pubmed-7700038
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-77000382020-12-07 Hierarchical Multimodal Adaptive Fusion (HMAF) Network for Prediction of RGB-D Saliency Lv, Ying Zhou, Wujie Comput Intell Neurosci Research Article Visual saliency prediction for RGB-D images is more challenging than that for their RGB counterparts. Additionally, very few investigations have been undertaken concerning RGB-D-saliency prediction. The proposed study presents a method based on a hierarchical multimodal adaptive fusion (HMAF) network to facilitate end-to-end prediction of RGB-D saliency. In the proposed method, hierarchical (multilevel) multimodal features are first extracted from an RGB image and depth map using a VGG-16-based two-stream network. Subsequently, the most significant hierarchical features of the said RGB image and depth map are predicted using three two-input attention modules. Furthermore, adaptive fusion of saliencies concerning the above-mentioned fused saliency features of different levels (hierarchical fusion saliency features) can be accomplished using a three-input attention module to facilitate high-accuracy RGB-D visual saliency prediction. Comparisons based on the application of the proposed HMAF-based approach against those of other state-of-the-art techniques on two challenging RGB-D datasets demonstrate that the proposed method outperforms other competing approaches consistently by a considerable margin. Hindawi 2020-11-20 /pmc/articles/PMC7700038/ /pubmed/33293945 http://dx.doi.org/10.1155/2020/8841681 Text en Copyright © 2020 Ying Lv and Wujie Zhou. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Hierarchical Multimodal Adaptive Fusion (HMAF) Network for Prediction of RGB-D Saliency
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7700038/
https://www.ncbi.nlm.nih.gov/pubmed/33293945
http://dx.doi.org/10.1155/2020/8841681
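
The description field above outlines a two-stage fusion: three two-input attention modules fuse RGB and depth features level by level, and a three-input attention module then fuses the three results. The sketch below illustrates only that data flow, and it is not the authors' implementation. It assumes that each feature map has already been flattened to a list of floats of equal length, and it stands in for the learned attention with a simple softmax over each input's mean activation. The names `softmax`, `attention_fuse`, and `hmaf_sketch` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(*features):
    """Fuse equally sized feature vectors with softmax attention weights.

    Assumption: the score of each input is its mean activation, a stand-in
    for the learned scoring layers a real attention module would use.
    Returns the fused vector and the weight given to each input.
    """
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all inputs must have the same length")
    scores = [sum(f) / length for f in features]
    weights = softmax(scores)
    fused = [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(length)
    ]
    return fused, weights


def hmaf_sketch(rgb_levels, depth_levels):
    """Two-stage fusion: per-level RGB/depth fusion, then cross-level fusion."""
    # Stage 1: one two-input fusion per hierarchy level (three in the abstract).
    fused_levels = [
        attention_fuse(rgb, depth)[0]
        for rgb, depth in zip(rgb_levels, depth_levels)
    ]
    # Stage 2: a single multi-input fusion across the fused levels.
    return attention_fuse(*fused_levels)


if __name__ == "__main__":
    # Toy features for three levels; the values are illustrative only.
    rgb = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]
    depth = [[0.1, 0.3], [0.4, 0.6], [0.7, 0.2]]
    saliency, level_weights = hmaf_sketch(rgb, depth)
    print(saliency, level_weights)
```

In the network described in the abstract, features from different VGG-16 levels would differ in spatial size and channel count, so they would need to be brought to a common shape before the second stage. The sketch sidesteps this by assuming equal-length vectors.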