Unsupervised Monocular Depth and Camera Pose Estimation with Multiple Masks and Geometric Consistency Constraints

This paper presents a novel unsupervised learning framework for estimating scene depth and camera pose from video sequences, fundamental to many high-level tasks such as 3D reconstruction, visual navigation, and augmented reality. Although existing unsupervised methods have achieved promising result...

Bibliographic Details
Main authors: Zhang, Xudong; Zhao, Baigan; Yao, Jiannan; Wu, Guoqing
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255976/
https://www.ncbi.nlm.nih.gov/pubmed/37300056
http://dx.doi.org/10.3390/s23115329
author Zhang, Xudong
Zhao, Baigan
Yao, Jiannan
Wu, Guoqing
author_sort Zhang, Xudong
collection PubMed
description This paper presents a novel unsupervised learning framework for estimating scene depth and camera pose from video sequences, a capability fundamental to many high-level tasks such as 3D reconstruction, visual navigation, and augmented reality. Although existing unsupervised methods have achieved promising results, their performance suffers in challenging scenes, such as those with dynamic objects and occluded regions. To mitigate these negative effects, this research adopts multiple masking techniques and geometric consistency constraints. First, multiple masks are used to identify outliers in the scene, which are then excluded from the loss computation. In addition, the identified outliers serve as a supervisory signal to train a mask estimation network. The estimated mask is then used to preprocess the input to the pose estimation network, mitigating the potential adverse effects of challenging scenes on pose estimation. Furthermore, we propose geometric consistency constraints that reduce sensitivity to illumination changes and act as additional supervisory signals to train the network. Experimental results on the KITTI dataset demonstrate that the proposed strategies effectively enhance the model’s performance, outperforming other unsupervised methods.
format Online
Article
Text
id pubmed-10255976
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10255976 2023-06-10 Unsupervised Monocular Depth and Camera Pose Estimation with Multiple Masks and Geometric Consistency Constraints Zhang, Xudong Zhao, Baigan Yao, Jiannan Wu, Guoqing Sensors (Basel) Article MDPI 2023-06-04 /pmc/articles/PMC10255976/ /pubmed/37300056 http://dx.doi.org/10.3390/s23115329 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Unsupervised Monocular Depth and Camera Pose Estimation with Multiple Masks and Geometric Consistency Constraints
topic Article