Depth Estimation from Light Field Geometry Using Convolutional Neural Networks †


Bibliographic Details
Main Authors: Han, Lei; Huang, Xiaohua; Shi, Zhan; Zheng, Shengnan
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8471881/
https://www.ncbi.nlm.nih.gov/pubmed/34577268
http://dx.doi.org/10.3390/s21186061
_version_ 1784574581926789120
author Han, Lei
Huang, Xiaohua
Shi, Zhan
Zheng, Shengnan
author_facet Han, Lei
Huang, Xiaohua
Shi, Zhan
Zheng, Shengnan
author_sort Han, Lei
collection PubMed
description Depth estimation based on light field imaging is a new methodology that has succeeded traditional binocular stereo matching and depth estimation from monocular images. Significant progress has been made in light-field depth estimation; nevertheless, the balance between computational time and the accuracy of depth estimation is still worth exploring. The geometry of light field imaging is the basis of depth estimation, and the abundance of light-field data makes it convenient to apply deep learning algorithms. The Epipolar Plane Image (EPI) generated from the light-field data has a line texture containing geometric information; the slope of each line is proportional to the depth of the corresponding object. Treating light-field depth estimation as a spatially dense prediction task, we design a convolutional neural network (ESTNet) to estimate accurate depth quickly. Inspired by the strong image feature extraction ability of convolutional neural networks, especially for texture images, we propose generating EPI synthetic images from light-field data as the input of ESTNet to improve feature extraction and depth estimation. The architecture of ESTNet is characterized by three input streams, an encoding-decoding structure, and skip-connections. The three input streams receive the horizontal EPI synthetic image (EPIh), the vertical EPI synthetic image (EPIv), and the central view image (CV), respectively. EPIh and EPIv contain rich texture and depth cues, while CV provides pixel position association information. ESTNet consists of two stages: encoding and decoding. The encoding stage comprises several convolution modules, and correspondingly, the decoding stage comprises transposed convolution modules. In addition to the forward propagation path of ESTNet, skip-connections are added between each convolution module and the corresponding transposed convolution module to fuse shallow local features with deep semantic features.
ESTNet is trained on one part of a synthetic light-field dataset and then tested on the remaining part of the synthetic dataset as well as on a real light-field dataset. Ablation experiments show that the ESTNet structure is reasonable. Experiments on both the synthetic and real light-field datasets show that ESTNet balances depth estimation accuracy and computational time.
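The EPI geometry the abstract relies on can be illustrated numerically. In an EPI, a scene point traces a line across the sub-aperture views, and under the common pinhole light-field model its depth follows from the line's slope (the per-view disparity) via Z = f·B/d. A minimal sketch, with hypothetical focal length and baseline values not taken from the paper:

```python
import numpy as np

# Hypothetical camera parameters (illustrative only, not from the paper):
# focal length in pixels and baseline between adjacent sub-aperture views.
FOCAL_PX = 100.0
BASELINE_M = 0.05

def epi_slope_to_depth(view_idx, x_pos, focal_px=FOCAL_PX, baseline=BASELINE_M):
    """Fit a line to one scene point's trajectory across the views
    (one line in the EPI) and convert its slope, i.e. the disparity
    in pixels per view step, to metric depth via Z = f * B / d."""
    slope, _intercept = np.polyfit(view_idx, x_pos, 1)
    disparity = abs(slope)  # pixels shifted per neighboring view
    return focal_px * baseline / disparity

# Synthetic EPI line: a point at depth 2.5 m shifts by
# d = f * B / Z = 100 * 0.05 / 2.5 = 2 px between neighboring views.
views = np.arange(9)
xs = 40.0 + 2.0 * views
depth = epi_slope_to_depth(views, xs)  # recovers 2.5
```

This is the standard epipolar-line relation; the paper's ESTNet learns the mapping from such line textures rather than fitting each line explicitly.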
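The described dataflow (three input streams, encoding by convolution modules, decoding by transposed-convolution modules, skip-connections fusing shallow and deep features) can be sketched at the shape level. The toy version below uses average pooling and nearest-neighbor upsampling as stand-ins for the paper's learned conv/transposed-conv modules, and channel concatenation as one plausible stream-fusion choice; it shows only the wiring, not the actual ESTNet layers:

```python
import numpy as np

def encode(x):
    """Toy 'convolution module': 2x average pooling stands in for
    convolution + downsampling."""
    h, w, c = x.shape
    return x[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def decode(x, skip):
    """Toy 'transposed convolution module': nearest-neighbor 2x
    upsampling, then a skip-connection that fuses the matching
    encoder feature by channel concatenation."""
    up = x.repeat(2, axis=0).repeat(2, axis=1)
    return np.concatenate([up, skip], axis=-1)

def estnet_like(epih, epiv, cv):
    # Three input streams: horizontal EPI image, vertical EPI image,
    # and central view, merged along the channel axis.
    x = np.concatenate([epih, epiv, cv], axis=-1)
    e1 = encode(x)        # shallow local features
    e2 = encode(e1)       # deeper semantic features
    d1 = decode(e2, e1)   # skip-connection fuses e1 back in
    d0 = decode(d1, x)    # skip-connection fuses input-level features
    return d0.mean(axis=-1)  # collapse channels into a per-pixel depth map

h = w = 32
epih = np.random.rand(h, w, 3)
epiv = np.random.rand(h, w, 3)
cv = np.random.rand(h, w, 3)
depth_map = estnet_like(epih, epiv, cv)  # shape (32, 32), one value per pixel
```

The skip-connections are what let input-resolution texture detail reach the output despite the downsampling bottleneck, which matches the abstract's motivation for fusing shallow local and deep semantic features.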
format Online
Article
Text
id pubmed-8471881
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8471881 2021-09-28 Depth Estimation from Light Field Geometry Using Convolutional Neural Networks † Han, Lei; Huang, Xiaohua; Shi, Zhan; Zheng, Shengnan. Sensors (Basel), Article. MDPI 2021-09-10 /pmc/articles/PMC8471881/ /pubmed/34577268 http://dx.doi.org/10.3390/s21186061 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Han, Lei
Huang, Xiaohua
Shi, Zhan
Zheng, Shengnan
Depth Estimation from Light Field Geometry Using Convolutional Neural Networks †
title Depth Estimation from Light Field Geometry Using Convolutional Neural Networks †
title_full Depth Estimation from Light Field Geometry Using Convolutional Neural Networks †
title_fullStr Depth Estimation from Light Field Geometry Using Convolutional Neural Networks †
title_full_unstemmed Depth Estimation from Light Field Geometry Using Convolutional Neural Networks †
title_short Depth Estimation from Light Field Geometry Using Convolutional Neural Networks †
title_sort depth estimation from light field geometry using convolutional neural networks †
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8471881/
https://www.ncbi.nlm.nih.gov/pubmed/34577268
http://dx.doi.org/10.3390/s21186061
work_keys_str_mv AT hanlei depthestimationfromlightfieldgeometryusingconvolutionalneuralnetworks
AT huangxiaohua depthestimationfromlightfieldgeometryusingconvolutionalneuralnetworks
AT shizhan depthestimationfromlightfieldgeometryusingconvolutionalneuralnetworks
AT zhengshengnan depthestimationfromlightfieldgeometryusingconvolutionalneuralnetworks