
GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation

Monocular depth estimation techniques are used to recover the distance from the target to the camera plane in an image scene. However, there are still several problems, such as insufficient estimation accuracy, the inaccurate localization of details, and depth discontinuity in planes parallel to the...

Full description

Bibliographic Details
Main Authors: Zhang, Cong, Xu, Ke, Ma, Yanxin, Wan, Jianwei
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10047826/
https://www.ncbi.nlm.nih.gov/pubmed/36981310
http://dx.doi.org/10.3390/e25030421
_version_ 1785014024097759232
author Zhang, Cong
Xu, Ke
Ma, Yanxin
Wan, Jianwei
author_facet Zhang, Cong
Xu, Ke
Ma, Yanxin
Wan, Jianwei
author_sort Zhang, Cong
collection PubMed
description Monocular depth estimation techniques are used to recover the distance from the target to the camera plane in an image scene. However, there are still several problems, such as insufficient estimation accuracy, the inaccurate localization of details, and depth discontinuity in planes parallel to the camera plane. To solve these problems, we propose the Global Feature Interaction Network (GFI-Net), which aims to utilize geometric features, such as object locations and vanishing points, on a global scale. In order to capture the interactive information of the width, height, and channel of the feature graph and expand the global information in the network, we designed a global interactive attention mechanism. The global interactive attention mechanism reduces the loss of pixel information and improves the performance of depth estimation. Furthermore, the encoder uses the Transformer to reduce coding losses and improve the accuracy of depth estimation. Finally, a local–global feature fusion module is designed to improve the depth map’s representation of detailed areas. The experimental results on the NYU-Depth-v2 dataset and the KITTI dataset showed that our model achieved state-of-the-art performance with full detail recovery and depth continuation on the same plane.
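The abstract's central idea, an attention block that lets the width, height, and channel axes of a feature map interact, can be sketched roughly as below. This is a minimal PyTorch illustration assuming a triplet-attention-style design; the class name GlobalInteractionAttention, the (max, mean) pooling summaries, the 7x7 convolutions, and the three-branch averaging are illustrative assumptions, not the authors' GFI-Net implementation.

import torch
import torch.nn as nn

class GlobalInteractionAttention(nn.Module):
    """Gates a (B, C, H, W) feature map along each of the C, H, and W axes
    and averages the three gated results (illustrative sketch only)."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        pad = kernel_size // 2
        # One small conv per branch; each sees a 2-channel (max, mean) summary map.
        self.conv_c = nn.Conv2d(2, 1, kernel_size, padding=pad, bias=False)
        self.conv_h = nn.Conv2d(2, 1, kernel_size, padding=pad, bias=False)
        self.conv_w = nn.Conv2d(2, 1, kernel_size, padding=pad, bias=False)

    @staticmethod
    def _pool(x):
        # Collapse dim 1 (the axis currently in the "channel" slot) into max and mean maps.
        return torch.cat([x.max(dim=1, keepdim=True)[0], x.mean(dim=1, keepdim=True)], dim=1)

    def forward(self, x):                                            # x: (B, C, H, W)
        # Channel branch: gate over C directly.
        out_c = x * torch.sigmoid(self.conv_c(self._pool(x)))
        # Height branch: rotate so H plays the channel role, gate, rotate back.
        x_h = x.permute(0, 2, 1, 3)                                  # (B, H, C, W)
        out_h = (x_h * torch.sigmoid(self.conv_h(self._pool(x_h)))).permute(0, 2, 1, 3)
        # Width branch: rotate so W plays the channel role, gate, rotate back.
        x_w = x.permute(0, 3, 2, 1)                                  # (B, W, H, C)
        out_w = (x_w * torch.sigmoid(self.conv_w(self._pool(x_w)))).permute(0, 3, 2, 1)
        # Average the three interaction branches.
        return (out_c + out_h + out_w) / 3.0

For a feature map x = torch.randn(2, 64, 32, 32), GlobalInteractionAttention()(x) returns a tensor of the same shape; rotating the tensor so that H or W occupies the channel position lets one cheap 2-D convolution model each pairwise axis interaction, which is one inexpensive way to approximate the global cross-dimension attention the abstract describes.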
format Online
Article
Text
id pubmed-10047826
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10047826 2023-03-29 GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation Zhang, Cong Xu, Ke Ma, Yanxin Wan, Jianwei Entropy (Basel) Article Monocular depth estimation techniques are used to recover the distance from the target to the camera plane in an image scene. However, there are still several problems, such as insufficient estimation accuracy, the inaccurate localization of details, and depth discontinuity in planes parallel to the camera plane. To solve these problems, we propose the Global Feature Interaction Network (GFI-Net), which aims to utilize geometric features, such as object locations and vanishing points, on a global scale. In order to capture the interactive information of the width, height, and channel of the feature graph and expand the global information in the network, we designed a global interactive attention mechanism. The global interactive attention mechanism reduces the loss of pixel information and improves the performance of depth estimation. Furthermore, the encoder uses the Transformer to reduce coding losses and improve the accuracy of depth estimation. Finally, a local–global feature fusion module is designed to improve the depth map’s representation of detailed areas. The experimental results on the NYU-Depth-v2 dataset and the KITTI dataset showed that our model achieved state-of-the-art performance with full detail recovery and depth continuation on the same plane. MDPI 2023-02-26 /pmc/articles/PMC10047826/ /pubmed/36981310 http://dx.doi.org/10.3390/e25030421 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Zhang, Cong
Xu, Ke
Ma, Yanxin
Wan, Jianwei
GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
title GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
title_full GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
title_fullStr GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
title_full_unstemmed GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
title_short GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
title_sort gfi-net: global feature interaction network for monocular depth estimation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10047826/
https://www.ncbi.nlm.nih.gov/pubmed/36981310
http://dx.doi.org/10.3390/e25030421
work_keys_str_mv AT zhangcong gfinetglobalfeatureinteractionnetworkformonoculardepthestimation
AT xuke gfinetglobalfeatureinteractionnetworkformonoculardepthestimation
AT mayanxin gfinetglobalfeatureinteractionnetworkformonoculardepthestimation
AT wanjianwei gfinetglobalfeatureinteractionnetworkformonoculardepthestimation