Hierarchical Multimodal Adaptive Fusion (HMAF) Network for Prediction of RGB-D Saliency
Visual saliency prediction for RGB-D images is more challenging than that for their RGB counterparts. Additionally, very few investigations have been undertaken concerning RGB-D-saliency prediction. The proposed study presents a method based on a hierarchical multimodal adaptive fusion (HMAF) networ...
Main Authors: | Lv, Ying; Zhou, Wujie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7700038/ https://www.ncbi.nlm.nih.gov/pubmed/33293945 http://dx.doi.org/10.1155/2020/8841681 |
_version_ | 1783616185809502208 |
---|---|
author | Lv, Ying Zhou, Wujie |
author_facet | Lv, Ying Zhou, Wujie |
author_sort | Lv, Ying |
collection | PubMed |
description | Visual saliency prediction for RGB-D images is more challenging than that for their RGB counterparts. Additionally, very few investigations have been undertaken concerning RGB-D-saliency prediction. The proposed study presents a method based on a hierarchical multimodal adaptive fusion (HMAF) network to facilitate end-to-end prediction of RGB-D saliency. In the proposed method, hierarchical (multilevel) multimodal features are first extracted from an RGB image and depth map using a VGG-16-based two-stream network. Subsequently, the most significant hierarchical features of the said RGB image and depth map are predicted using three two-input attention modules. Furthermore, adaptive fusion of saliencies concerning the above-mentioned fused saliency features of different levels (hierarchical fusion saliency features) can be accomplished using a three-input attention module to facilitate high-accuracy RGB-D visual saliency prediction. Comparisons based on the application of the proposed HMAF-based approach against those of other state-of-the-art techniques on two challenging RGB-D datasets demonstrate that the proposed method outperforms other competing approaches consistently by a considerable margin. |
format | Online Article Text |
id | pubmed-7700038 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-77000382020-12-07 Hierarchical Multimodal Adaptive Fusion (HMAF) Network for Prediction of RGB-D Saliency Lv, Ying Zhou, Wujie Comput Intell Neurosci Research Article Visual saliency prediction for RGB-D images is more challenging than that for their RGB counterparts. Additionally, very few investigations have been undertaken concerning RGB-D-saliency prediction. The proposed study presents a method based on a hierarchical multimodal adaptive fusion (HMAF) network to facilitate end-to-end prediction of RGB-D saliency. In the proposed method, hierarchical (multilevel) multimodal features are first extracted from an RGB image and depth map using a VGG-16-based two-stream network. Subsequently, the most significant hierarchical features of the said RGB image and depth map are predicted using three two-input attention modules. Furthermore, adaptive fusion of saliencies concerning the above-mentioned fused saliency features of different levels (hierarchical fusion saliency features) can be accomplished using a three-input attention module to facilitate high-accuracy RGB-D visual saliency prediction. Comparisons based on the application of the proposed HMAF-based approach against those of other state-of-the-art techniques on two challenging RGB-D datasets demonstrate that the proposed method outperforms other competing approaches consistently by a considerable margin. Hindawi 2020-11-20 /pmc/articles/PMC7700038/ /pubmed/33293945 http://dx.doi.org/10.1155/2020/8841681 Text en Copyright © 2020 Ying Lv and Wujie Zhou. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Lv, Ying Zhou, Wujie Hierarchical Multimodal Adaptive Fusion (HMAF) Network for Prediction of RGB-D Saliency |
title | Hierarchical Multimodal Adaptive Fusion (HMAF) Network for Prediction of RGB-D Saliency |
title_sort | hierarchical multimodal adaptive fusion (hmaf) network for prediction of rgb-d saliency |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7700038/ https://www.ncbi.nlm.nih.gov/pubmed/33293945 http://dx.doi.org/10.1155/2020/8841681 |
work_keys_str_mv | AT lvying hierarchicalmultimodaladaptivefusionhmafnetworkforpredictionofrgbdsaliency AT zhouwujie hierarchicalmultimodaladaptivefusionhmafnetworkforpredictionofrgbdsaliency |
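
The description field outlines a two-stage fusion: three two-input attention modules fuse RGB and depth features level by level, and a three-input attention module then fuses the three results. The following is a minimal sketch of that data flow only, not the authors' implementation. Everything in it is an assumption for illustration: the names `attention_fuse` and `hmaf_sketch` are hypothetical, features are flattened to plain lists of equal length (a real network would resize feature maps from different levels), and the attention score is simply the mean activation of each input, whereas the paper's attention modules are learned and are not specified in this record.

```python
import math
from typing import List, Sequence

Feature = List[float]  # hypothetical stand-in for a flattened feature map


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a small list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features: Sequence[Feature]) -> Feature:
    """Fuse equally sized feature vectors as a softmax-weighted sum.

    ASSUMPTION: each input is scored by its mean activation. This is a
    placeholder for a learned attention module, not the paper's design.
    """
    if not features:
        raise ValueError("at least one feature vector is required")
    length = len(features[0])
    if length == 0 or any(len(f) != length for f in features):
        raise ValueError("feature vectors must be non-empty and equally sized")
    scores = [sum(f) / length for f in features]
    weights = softmax(scores)
    return [
        sum(w * f[i] for w, f in zip(weights, features))
        for i in range(length)
    ]


def hmaf_sketch(rgb_levels: Sequence[Feature],
                depth_levels: Sequence[Feature]) -> Feature:
    """Two-stage fusion following the abstract's description.

    Stage 1: one two-input fusion per hierarchical level (RGB + depth).
    Stage 2: one multi-input fusion across the fused levels.
    """
    if len(rgb_levels) != len(depth_levels):
        raise ValueError("RGB and depth streams must have the same number of levels")
    fused_levels = [
        attention_fuse([rgb, depth])
        for rgb, depth in zip(rgb_levels, depth_levels)
    ]
    return attention_fuse(fused_levels)


if __name__ == "__main__":
    # Toy inputs: three levels per stream, four values per level.
    rgb = [[0.1, 0.4, 0.3, 0.2], [0.5, 0.5, 0.1, 0.0], [0.9, 0.2, 0.2, 0.1]]
    depth = [[0.2, 0.2, 0.6, 0.1], [0.0, 0.3, 0.3, 0.3], [0.4, 0.4, 0.1, 0.1]]
    print(hmaf_sketch(rgb, depth))
```

Because each stage is a convex combination of its inputs, every fused value stays within the range of the corresponding input values. That makes the sketch easy to sanity-check, but it says nothing about the accuracy of the published model.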