Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers

Bibliographic Details
Main Authors: Choi, Youn-Ho, Kee, Seok-Cheol
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9862419/
https://www.ncbi.nlm.nih.gov/pubmed/36679643
http://dx.doi.org/10.3390/s23020845
_version_ 1784875087788244992
author Choi, Youn-Ho
Kee, Seok-Cheol
author_facet Choi, Youn-Ho
Kee, Seok-Cheol
author_sort Choi, Youn-Ho
collection PubMed
description Estimating accurate depth from 2D images is an important problem, and it has been studied for a long time. Recently, as deep-learning-based depth estimation from monocular camera images has progressed, a variety of techniques for estimating accurate depth have been proposed. However, predicting the boundaries between objects remains a difficulty of depth estimation from 2D images. In this paper, we aim to predict refined depth maps by emphasizing precise boundaries between objects. We propose a depth estimation network with an encoder–decoder structure that uses a Laplacian image pyramid and the local planar guidance method. When the features learned by the encoder are upsampled in the decoder, these two techniques guide the reconstruction of sharper object boundaries and thus yield a clearer depth map. We train and test our models on the KITTI and NYU Depth V2 datasets. The proposed network is a DNN built only from convolutions and uses ConvNeXt as its backbone. The trained model achieves an absolute relative error (Abs_rel) of 0.054 and a root mean square error (RMSE) of 2.252 on the KITTI dataset, and an Abs_rel of 0.102 and an RMSE of 0.355 on the NYU Depth V2 dataset. Among state-of-the-art monocular depth estimation methods, our network ranks fifth on the KITTI Eigen split and eighth on NYU Depth V2.
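
The description above names a Laplacian image pyramid as the boundary-sharpening ingredient and reports Abs_rel and RMSE figures. As an illustrative sketch only (not the authors' implementation), the following NumPy code builds a Laplacian pyramid and computes those two metrics; the 5-tap blur kernel, nearest-neighbour upsampling, pyramid depth, and synthetic depth values are assumptions made for demonstration.

import numpy as np

def blur_downsample(img):
    """Blur with a separable 5-tap binomial kernel, then keep every other pixel."""
    k = np.array([1., 4., 6., 4., 1.]) / 16.0
    pad = np.pad(img, 2, mode="reflect")
    rows = sum(k[i] * pad[:, i:i + img.shape[1]] for i in range(5))  # horizontal pass
    out = sum(k[i] * rows[i:i + img.shape[0], :] for i in range(5))  # vertical pass
    return out[::2, ::2]

def upsample(img, shape):
    """Nearest-neighbour upsampling back to `shape` (a bilinear filter would be smoother)."""
    up = img.repeat(2, axis=0).repeat(2, axis=1)
    return up[:shape[0], :shape[1]]

def laplacian_pyramid(img, levels=3):
    """Return [detail_0, ..., detail_{levels-1}, residual]; each detail level holds the
    band-pass information lost by one blur-and-downsample step."""
    pyramid, current = [], img.astype(np.float64)
    for _ in range(levels):
        down = blur_downsample(current)
        pyramid.append(current - upsample(down, current.shape))  # high-frequency detail
        current = down
    pyramid.append(current)  # low-frequency residual
    return pyramid

def abs_rel(pred, gt):
    """Mean absolute relative error over valid (gt > 0) pixels."""
    m = gt > 0
    return float(np.mean(np.abs(pred[m] - gt[m]) / gt[m]))

def rmse(pred, gt):
    """Root mean square error over valid (gt > 0) pixels."""
    m = gt > 0
    return float(np.sqrt(np.mean((pred[m] - gt[m]) ** 2)))

# Toy usage with random data (shapes and depth range are assumptions).
rng = np.random.default_rng(0)
image = rng.random((64, 64))
print([lvl.shape for lvl in laplacian_pyramid(image)])  # (64,64), (32,32), (16,16), (8,8)
gt_depth = rng.random((64, 64)) * 79.0 + 1.0             # fake depths of 1-80 m
pred_depth = gt_depth + rng.normal(0.0, 0.5, gt_depth.shape)
print(abs_rel(pred_depth, gt_depth), rmse(pred_depth, gt_depth))

Each pyramid level stores the detail removed by one blur-and-downsample step, which is why such levels are a natural guide for recovering sharp object boundaries when the decoder upsamples its features.
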
format Online
Article
Text
id pubmed-9862419
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9862419 2023-01-22 Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers Choi, Youn-Ho Kee, Seok-Cheol Sensors (Basel) Article Estimating accurate depth from 2D images is an important problem, and it has been studied for a long time. Recently, as deep-learning-based depth estimation from monocular camera images has progressed, a variety of techniques for estimating accurate depth have been proposed. However, predicting the boundaries between objects remains a difficulty of depth estimation from 2D images. In this paper, we aim to predict refined depth maps by emphasizing precise boundaries between objects. We propose a depth estimation network with an encoder–decoder structure that uses a Laplacian image pyramid and the local planar guidance method. When the features learned by the encoder are upsampled in the decoder, these two techniques guide the reconstruction of sharper object boundaries and thus yield a clearer depth map. We train and test our models on the KITTI and NYU Depth V2 datasets. The proposed network is a DNN built only from convolutions and uses ConvNeXt as its backbone. The trained model achieves an absolute relative error (Abs_rel) of 0.054 and a root mean square error (RMSE) of 2.252 on the KITTI dataset, and an Abs_rel of 0.102 and an RMSE of 0.355 on the NYU Depth V2 dataset. Among state-of-the-art monocular depth estimation methods, our network ranks fifth on the KITTI Eigen split and eighth on NYU Depth V2. MDPI 2023-01-11 /pmc/articles/PMC9862419/ /pubmed/36679643 http://dx.doi.org/10.3390/s23020845 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Choi, Youn-Ho
Kee, Seok-Cheol
Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers
title Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers
title_full Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers
title_fullStr Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers
title_full_unstemmed Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers
title_short Monocular Depth Estimation Using a Laplacian Image Pyramid with Local Planar Guidance Layers
title_sort monocular depth estimation using a laplacian image pyramid with local planar guidance layers
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9862419/
https://www.ncbi.nlm.nih.gov/pubmed/36679643
http://dx.doi.org/10.3390/s23020845
work_keys_str_mv AT choiyounho monoculardepthestimationusingalaplacianimagepyramidwithlocalplanarguidancelayers
AT keeseokcheol monoculardepthestimationusingalaplacianimagepyramidwithlocalplanarguidancelayers