Unsupervised Monocular Depth and Camera Pose Estimation with Multiple Masks and Geometric Consistency Constraints
This paper presents a novel unsupervised learning framework for estimating scene depth and camera pose from video sequences, fundamental to many high-level tasks such as 3D reconstruction, visual navigation, and augmented reality. Although existing unsupervised methods have achieved promising result...
Main authors: | Zhang, Xudong; Zhao, Baigan; Yao, Jiannan; Wu, Guoqing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255976/ https://www.ncbi.nlm.nih.gov/pubmed/37300056 http://dx.doi.org/10.3390/s23115329 |
_version_ | 1785057002676813824 |
---|---|
author | Zhang, Xudong Zhao, Baigan Yao, Jiannan Wu, Guoqing |
author_facet | Zhang, Xudong Zhao, Baigan Yao, Jiannan Wu, Guoqing |
author_sort | Zhang, Xudong |
collection | PubMed |
description | This paper presents a novel unsupervised learning framework for estimating scene depth and camera pose from video sequences, fundamental to many high-level tasks such as 3D reconstruction, visual navigation, and augmented reality. Although existing unsupervised methods have achieved promising results, their performance suffers in challenging scenes such as those with dynamic objects and occluded regions. As a result, multiple mask technologies and geometric consistency constraints are adopted in this research to mitigate their negative impacts. Firstly, multiple mask technologies are used to identify numerous outliers in the scene, which are excluded from the loss computation. In addition, the identified outliers are employed as a supervised signal to train a mask estimation network. The estimated mask is then utilized to preprocess the input to the pose estimation network, mitigating the potential adverse effects of challenging scenes on pose estimation. Furthermore, we propose geometric consistency constraints to reduce the sensitivity of illumination changes, which act as additional supervised signals to train the network. Experimental results on the KITTI dataset demonstrate that our proposed strategies can effectively enhance the model’s performance, outperforming other unsupervised methods. |
format | Online Article Text |
id | pubmed-10255976 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10255976 2023-06-10 Unsupervised Monocular Depth and Camera Pose Estimation with Multiple Masks and Geometric Consistency Constraints Zhang, Xudong Zhao, Baigan Yao, Jiannan Wu, Guoqing Sensors (Basel) Article This paper presents a novel unsupervised learning framework for estimating scene depth and camera pose from video sequences, fundamental to many high-level tasks such as 3D reconstruction, visual navigation, and augmented reality. Although existing unsupervised methods have achieved promising results, their performance suffers in challenging scenes such as those with dynamic objects and occluded regions. As a result, multiple mask technologies and geometric consistency constraints are adopted in this research to mitigate their negative impacts. Firstly, multiple mask technologies are used to identify numerous outliers in the scene, which are excluded from the loss computation. In addition, the identified outliers are employed as a supervised signal to train a mask estimation network. The estimated mask is then utilized to preprocess the input to the pose estimation network, mitigating the potential adverse effects of challenging scenes on pose estimation. Furthermore, we propose geometric consistency constraints to reduce the sensitivity of illumination changes, which act as additional supervised signals to train the network. Experimental results on the KITTI dataset demonstrate that our proposed strategies can effectively enhance the model’s performance, outperforming other unsupervised methods. MDPI 2023-06-04 /pmc/articles/PMC10255976/ /pubmed/37300056 http://dx.doi.org/10.3390/s23115329 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Zhang, Xudong Zhao, Baigan Yao, Jiannan Wu, Guoqing Unsupervised Monocular Depth and Camera Pose Estimation with Multiple Masks and Geometric Consistency Constraints |
title | Unsupervised Monocular Depth and Camera Pose Estimation with Multiple Masks and Geometric Consistency Constraints |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10255976/ https://www.ncbi.nlm.nih.gov/pubmed/37300056 http://dx.doi.org/10.3390/s23115329 |
work_keys_str_mv | AT zhangxudong unsupervisedmonoculardepthandcameraposeestimationwithmultiplemasksandgeometricconsistencyconstraints AT zhaobaigan unsupervisedmonoculardepthandcameraposeestimationwithmultiplemasksandgeometricconsistencyconstraints AT yaojiannan unsupervisedmonoculardepthandcameraposeestimationwithmultiplemasksandgeometricconsistencyconstraints AT wuguoqing unsupervisedmonoculardepthandcameraposeestimationwithmultiplemasksandgeometricconsistencyconstraints |
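
The description above says that pixels identified as outliers are excluded from the loss computation. A minimal sketch of that idea, assuming a simple mean absolute photometric error and a binary validity mask, might look like the following. It is not the authors' implementation; the function name `masked_photometric_loss`, its parameters, and the toy arrays are hypothetical and chosen only for illustration.

```python
import numpy as np

def masked_photometric_loss(target, warped, valid_mask, eps=1e-7):
    """Mean absolute photometric error over pixels where valid_mask is 1.

    target, warped: float arrays of shape (H, W, C).
    valid_mask: array of shape (H, W); 0 marks an outlier pixel to ignore.
    All names here are illustrative assumptions, not taken from the paper.
    """
    target = np.asarray(target, dtype=np.float64)
    warped = np.asarray(warped, dtype=np.float64)
    mask = np.asarray(valid_mask, dtype=np.float64)
    # Per-pixel error, averaged over the colour channels.
    per_pixel = np.abs(target - warped).mean(axis=-1)
    # Sum only the valid pixels and normalise by their count.
    return float((per_pixel * mask).sum() / (mask.sum() + eps))

# Toy example: one pixel disagrees strongly (e.g. a moving object).
target = np.zeros((2, 2, 3))
warped = target.copy()
warped[0, 0] = 1.0            # simulated outlier pixel
mask = np.ones((2, 2))
mask[0, 0] = 0.0              # mark that pixel as invalid

loss_masked = masked_photometric_loss(target, warped, mask)
loss_unmasked = masked_photometric_loss(target, warped, np.ones((2, 2)))
print(loss_masked, loss_unmasked)
```

In this toy case the masked loss ignores the single disagreeing pixel, while the unmasked loss is pulled up by it. The paper's actual masks and loss terms may differ from this sketch.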