Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers
Main Authors: | Choi, Youn-Ho; Kee, Seok-Cheol |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9862419/ https://www.ncbi.nlm.nih.gov/pubmed/36679643 http://dx.doi.org/10.3390/s23020845 |
_version_ | 1784875087788244992 |
---|---|
author | Choi, Youn-Ho; Kee, Seok-Cheol |
author_facet | Choi, Youn-Ho; Kee, Seok-Cheol |
author_sort | Choi, Youn-Ho |
collection | PubMed |
description | It is important to estimate the exact depth from 2D images, and many studies have been conducted for a long period of time to solve depth estimation problems. Recently, as research on estimating depth from monocular camera images based on deep learning is progressing, research for estimating accurate depths using various techniques is being conducted. However, depth estimation from 2D images has been a problem in predicting the boundary between objects. In this paper, we aim to predict sophisticated depths by emphasizing the precise boundaries between objects. We propose a depth estimation network with encoder–decoder structures using the Laplacian pyramid and local planar guidance method. In the process of upsampling the learned features using the encoder, the purpose of this step is to obtain a clearer depth map by guiding a more sophisticated boundary of an object using the Laplacian pyramid and local planar guidance techniques. We train and test our models with KITTI and NYU Depth V2 datasets. The proposed network constructs a DNN using only convolution and uses the ConvNext networks as a backbone. A trained model shows the performance of the absolute relative error (Abs_rel) 0.054 and root mean square error (RMSE) 2.252 based on the KITTI dataset and absolute relative error (Abs_rel) 0.102 and root mean square error 0.355 based on the NYU Depth V2 dataset. On the state-of-the-art monocular depth estimation, our network performance shows the fifth-best performance based on the KITTI Eigen split and the eighth-best performance based on the NYU Depth V2. |
format | Online Article Text |
id | pubmed-9862419 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9862419 2023-01-22 Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers Choi, Youn-Ho Kee, Seok-Cheol Sensors (Basel) Article It is important to estimate the exact depth from 2D images, and many studies have been conducted for a long period of time to solve depth estimation problems. Recently, as research on estimating depth from monocular camera images based on deep learning is progressing, research for estimating accurate depths using various techniques is being conducted. However, depth estimation from 2D images has been a problem in predicting the boundary between objects. In this paper, we aim to predict sophisticated depths by emphasizing the precise boundaries between objects. We propose a depth estimation network with encoder–decoder structures using the Laplacian pyramid and local planar guidance method. In the process of upsampling the learned features using the encoder, the purpose of this step is to obtain a clearer depth map by guiding a more sophisticated boundary of an object using the Laplacian pyramid and local planar guidance techniques. We train and test our models with KITTI and NYU Depth V2 datasets. The proposed network constructs a DNN using only convolution and uses the ConvNext networks as a backbone. A trained model shows the performance of the absolute relative error (Abs_rel) 0.054 and root mean square error (RMSE) 2.252 based on the KITTI dataset and absolute relative error (Abs_rel) 0.102 and root mean square error 0.355 based on the NYU Depth V2 dataset. On the state-of-the-art monocular depth estimation, our network performance shows the fifth-best performance based on the KITTI Eigen split and the eighth-best performance based on the NYU Depth V2. MDPI 2023-01-11 /pmc/articles/PMC9862419/ /pubmed/36679643 http://dx.doi.org/10.3390/s23020845 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Choi, Youn-Ho Kee, Seok-Cheol Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers |
title | Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers |
title_full | Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers |
title_fullStr | Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers |
title_full_unstemmed | Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers |
title_short | Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers |
title_sort | monocular depth estimation using a laplacian image pyramid with local planar guidance layers |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9862419/ https://www.ncbi.nlm.nih.gov/pubmed/36679643 http://dx.doi.org/10.3390/s23020845 |
work_keys_str_mv | AT choiyounho monoculardepthestimationusingalaplacianimagepyramidwithlocalplanarguidancelayers AT keeseokcheol monoculardepthestimationusingalaplacianimagepyramidwithlocalplanarguidancelayers |