
GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation

Monocular depth estimation techniques are used to recover the distance from the target to the camera plane in an image scene. However, there are still several problems, such as insufficient estimation accuracy, the inaccurate localization of details, and depth discontinuity in planes parallel to the...

Full description

Bibliographic Details
Main Authors: Zhang, Cong, Xu, Ke, Ma, Yanxin, Wan, Jianwei
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10047826/
https://www.ncbi.nlm.nih.gov/pubmed/36981310
http://dx.doi.org/10.3390/e25030421
_version_ 1785014024097759232
author Zhang, Cong
Xu, Ke
Ma, Yanxin
Wan, Jianwei
author_facet Zhang, Cong
Xu, Ke
Ma, Yanxin
Wan, Jianwei
author_sort Zhang, Cong
collection PubMed
description Monocular depth estimation techniques are used to recover the distance from the target to the camera plane in an image scene. However, there are still several problems, such as insufficient estimation accuracy, the inaccurate localization of details, and depth discontinuity in planes parallel to the camera plane. To solve these problems, we propose the Global Feature Interaction Network (GFI-Net), which aims to utilize geometric features, such as object locations and vanishing points, on a global scale. In order to capture the interactive information of the width, height, and channel of the feature graph and expand the global information in the network, we designed a global interactive attention mechanism. The global interactive attention mechanism reduces the loss of pixel information and improves the performance of depth estimation. Furthermore, the encoder uses the Transformer to reduce coding losses and improve the accuracy of depth estimation. Finally, a local–global feature fusion module is designed to improve the depth map’s representation of detailed areas. The experimental results on the NYU-Depth-v2 dataset and the KITTI dataset showed that our model achieved state-of-the-art performance with full detail recovery and depth continuation on the same plane.
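The abstract's central idea, an attention block that lets the width, height, and channel axes of a feature map interact, can be sketched roughly as below. This is a minimal PyTorch illustration assuming a triplet-attention-style design; the class name GlobalInteractionAttention, the (max, mean) pooling summaries, the 7x7 convolutions, and the three-branch averaging are illustrative assumptions, not the authors' GFI-Net implementation.

import torch
import torch.nn as nn

class GlobalInteractionAttention(nn.Module):
    """Gates a (B, C, H, W) feature map along each of the C, H, and W axes
    and averages the three gated results (illustrative sketch only)."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        pad = kernel_size // 2
        # One small conv per branch; each sees a 2-channel (max, mean) summary map.
        self.conv_c = nn.Conv2d(2, 1, kernel_size, padding=pad, bias=False)
        self.conv_h = nn.Conv2d(2, 1, kernel_size, padding=pad, bias=False)
        self.conv_w = nn.Conv2d(2, 1, kernel_size, padding=pad, bias=False)

    @staticmethod
    def _pool(x):
        # Collapse dim 1 (the axis currently in the "channel" slot) into max and mean maps.
        return torch.cat([x.max(dim=1, keepdim=True)[0], x.mean(dim=1, keepdim=True)], dim=1)

    def forward(self, x):                                            # x: (B, C, H, W)
        # Channel branch: gate over C directly.
        out_c = x * torch.sigmoid(self.conv_c(self._pool(x)))
        # Height branch: rotate so H plays the channel role, gate, rotate back.
        x_h = x.permute(0, 2, 1, 3)                                  # (B, H, C, W)
        out_h = (x_h * torch.sigmoid(self.conv_h(self._pool(x_h)))).permute(0, 2, 1, 3)
        # Width branch: rotate so W plays the channel role, gate, rotate back.
        x_w = x.permute(0, 3, 2, 1)                                  # (B, W, H, C)
        out_w = (x_w * torch.sigmoid(self.conv_w(self._pool(x_w)))).permute(0, 3, 2, 1)
        # Average the three interaction branches.
        return (out_c + out_h + out_w) / 3.0

For a feature map x = torch.randn(2, 64, 32, 32), GlobalInteractionAttention()(x) returns a tensor of the same shape; rotating the tensor so that H or W occupies the channel position lets one cheap 2-D convolution model each pairwise axis interaction, which is one inexpensive way to approximate the global cross-dimension attention the abstract describes.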
format Online
Article
Text
id pubmed-10047826
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10047826 2023-03-29 GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation Zhang, Cong Xu, Ke Ma, Yanxin Wan, Jianwei Entropy (Basel) Article Monocular depth estimation techniques are used to recover the distance from the target to the camera plane in an image scene. However, there are still several problems, such as insufficient estimation accuracy, the inaccurate localization of details, and depth discontinuity in planes parallel to the camera plane. To solve these problems, we propose the Global Feature Interaction Network (GFI-Net), which aims to utilize geometric features, such as object locations and vanishing points, on a global scale. In order to capture the interactive information of the width, height, and channel of the feature graph and expand the global information in the network, we designed a global interactive attention mechanism. The global interactive attention mechanism reduces the loss of pixel information and improves the performance of depth estimation. Furthermore, the encoder uses the Transformer to reduce coding losses and improve the accuracy of depth estimation. Finally, a local–global feature fusion module is designed to improve the depth map’s representation of detailed areas. The experimental results on the NYU-Depth-v2 dataset and the KITTI dataset showed that our model achieved state-of-the-art performance with full detail recovery and depth continuation on the same plane. MDPI 2023-02-26 /pmc/articles/PMC10047826/ /pubmed/36981310 http://dx.doi.org/10.3390/e25030421 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Zhang, Cong
Xu, Ke
Ma, Yanxin
Wan, Jianwei
GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
title GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
title_full GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
title_fullStr GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
title_full_unstemmed GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
title_short GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
title_sort gfi-net: global feature interaction network for monocular depth estimation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10047826/
https://www.ncbi.nlm.nih.gov/pubmed/36981310
http://dx.doi.org/10.3390/e25030421
work_keys_str_mv AT zhangcong gfinetglobalfeatureinteractionnetworkformonoculardepthestimation
AT xuke gfinetglobalfeatureinteractionnetworkformonoculardepthestimation
AT mayanxin gfinetglobalfeatureinteractionnetworkformonoculardepthestimation
AT wanjianwei gfinetglobalfeatureinteractionnetworkformonoculardepthestimation