GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation
Main Authors: | Zhang, Cong; Xu, Ke; Ma, Yanxin; Wan, Jianwei |
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2023 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10047826/ https://www.ncbi.nlm.nih.gov/pubmed/36981310 http://dx.doi.org/10.3390/e25030421 |
_version_ | 1785014024097759232 |
author | Zhang, Cong Xu, Ke Ma, Yanxin Wan, Jianwei |
author_facet | Zhang, Cong Xu, Ke Ma, Yanxin Wan, Jianwei |
author_sort | Zhang, Cong |
collection | PubMed |
description | Monocular depth estimation techniques are used to recover the distance from the target to the camera plane in an image scene. However, there are still several problems, such as insufficient estimation accuracy, the inaccurate localization of details, and depth discontinuity in planes parallel to the camera plane. To solve these problems, we propose the Global Feature Interaction Network (GFI-Net), which aims to utilize geometric features, such as object locations and vanishing points, on a global scale. In order to capture the interactive information of the width, height, and channel of the feature graph and expand the global information in the network, we designed a global interactive attention mechanism. The global interactive attention mechanism reduces the loss of pixel information and improves the performance of depth estimation. Furthermore, the encoder uses the Transformer to reduce coding losses and improve the accuracy of depth estimation. Finally, a local–global feature fusion module is designed to improve the depth map’s representation of detailed areas. The experimental results on the NYU-Depth-v2 dataset and the KITTI dataset showed that our model achieved state-of-the-art performance with full detail recovery and depth continuation on the same plane. |
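The abstract describes an attention mechanism that captures interactions along the width, height, and channel axes of a feature map. The paper's exact formulation is not given in this record, but the idea of axis-wise global gating can be sketched as follows; every function name here is hypothetical, and the pooling-plus-sigmoid gating and branch averaging are illustrative assumptions, not the authors' method.

```python
import numpy as np

def axis_attention(x, axis):
    # Hypothetical axis-wise gate: global average-pool over all axes
    # except `axis`, then squash with a sigmoid and rescale the input.
    other = tuple(i for i in range(x.ndim) if i != axis)
    pooled = x.mean(axis=other, keepdims=True)   # singleton dims except `axis`
    gate = 1.0 / (1.0 + np.exp(-pooled))         # sigmoid gate in (0, 1)
    return x * gate                              # broadcast multiply

def global_interactive_attention(x):
    """Gate a (C, H, W) feature map along each axis in turn and average
    the three branches -- a rough stand-in for the cross-axis
    'interaction' the abstract alludes to (assumption, not the paper)."""
    branches = [axis_attention(x, a) for a in range(3)]
    return sum(branches) / 3.0

feat = np.ones((4, 8, 8))   # toy feature map: C=4, H=8, W=8
out = global_interactive_attention(feat)
print(out.shape)            # shape is preserved: (4, 8, 8)
```

Because the gate only rescales activations, the output keeps the input's shape, which is what lets such a module drop into an encoder-decoder depth network without changing the surrounding layers.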
format | Online Article Text |
id | pubmed-10047826 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-100478262023-03-29 GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation Zhang, Cong Xu, Ke Ma, Yanxin Wan, Jianwei Entropy (Basel) Article Monocular depth estimation techniques are used to recover the distance from the target to the camera plane in an image scene. However, there are still several problems, such as insufficient estimation accuracy, the inaccurate localization of details, and depth discontinuity in planes parallel to the camera plane. To solve these problems, we propose the Global Feature Interaction Network (GFI-Net), which aims to utilize geometric features, such as object locations and vanishing points, on a global scale. In order to capture the interactive information of the width, height, and channel of the feature graph and expand the global information in the network, we designed a global interactive attention mechanism. The global interactive attention mechanism reduces the loss of pixel information and improves the performance of depth estimation. Furthermore, the encoder uses the Transformer to reduce coding losses and improve the accuracy of depth estimation. Finally, a local–global feature fusion module is designed to improve the depth map’s representation of detailed areas. The experimental results on the NYU-Depth-v2 dataset and the KITTI dataset showed that our model achieved state-of-the-art performance with full detail recovery and depth continuation on the same plane. MDPI 2023-02-26 /pmc/articles/PMC10047826/ /pubmed/36981310 http://dx.doi.org/10.3390/e25030421 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Zhang, Cong Xu, Ke Ma, Yanxin Wan, Jianwei GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation |
title | GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation |
title_full | GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation |
title_fullStr | GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation |
title_full_unstemmed | GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation |
title_short | GFI-Net: Global Feature Interaction Network for Monocular Depth Estimation |
title_sort | gfi-net: global feature interaction network for monocular depth estimation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10047826/ https://www.ncbi.nlm.nih.gov/pubmed/36981310 http://dx.doi.org/10.3390/e25030421 |
work_keys_str_mv | AT zhangcong gfinetglobalfeatureinteractionnetworkformonoculardepthestimation AT xuke gfinetglobalfeatureinteractionnetworkformonoculardepthestimation AT mayanxin gfinetglobalfeatureinteractionnetworkformonoculardepthestimation AT wanjianwei gfinetglobalfeatureinteractionnetworkformonoculardepthestimation |