
Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning

Visual simultaneous localization and mapping (VSLAM) plays a vital role in the field of positioning and navigation. At the heart of VSLAM is visual odometry (VO), which uses continuous images to estimate the camera’s ego-motion. However, due to many assumptions of the classical VO system, robots can...


Bibliographic Details
Main authors:	Zhang, Sumin, Lu, Shouyi, He, Rui, Bao, Zhipeng
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8309519/ https://www.ncbi.nlm.nih.gov/pubmed/34300475 http://dx.doi.org/10.3390/s21144735

_version_	1783728540723707904
author	Zhang, Sumin Lu, Shouyi He, Rui Bao, Zhipeng
author_facet	Zhang, Sumin Lu, Shouyi He, Rui Bao, Zhipeng
author_sort	Zhang, Sumin
collection	PubMed
description	Visual simultaneous localization and mapping (VSLAM) plays a vital role in the field of positioning and navigation. At the heart of VSLAM is visual odometry (VO), which uses continuous images to estimate the camera’s ego-motion. However, due to many assumptions of the classical VO system, robots can hardly operate in challenging environments. To solve this challenge, we combine the multiview geometry constraints of the classical stereo VO system with the robustness of deep learning to present an unsupervised pose correction network for the classical stereo VO system. The pose correction network regresses a pose correction that results in positioning error due to violation of modeling assumptions to make the classical stereo VO positioning more accurate. The pose correction network does not rely on the dataset with ground truth poses for training. The pose correction network also simultaneously generates a depth map and an explainability mask. Extensive experiments on the KITTI dataset show the pose correction network can significantly improve the positioning accuracy of the classical stereo VO system. Notably, the corrected classical stereo VO system’s average absolute trajectory error, average translational relative pose error, and average translational root-mean-square drift on a length of 100–800 m in the KITTI dataset is 13.77 cm, 0.038 m, and 1.08%, respectively. Therefore, the improved stereo VO system has almost reached the state of the art.
format	Online Article Text
id	pubmed-8309519
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-83095192021-07-25 Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning Zhang, Sumin Lu, Shouyi He, Rui Bao, Zhipeng Sensors (Basel) Article Visual simultaneous localization and mapping (VSLAM) plays a vital role in the field of positioning and navigation. At the heart of VSLAM is visual odometry (VO), which uses continuous images to estimate the camera’s ego-motion. However, due to many assumptions of the classical VO system, robots can hardly operate in challenging environments. To solve this challenge, we combine the multiview geometry constraints of the classical stereo VO system with the robustness of deep learning to present an unsupervised pose correction network for the classical stereo VO system. The pose correction network regresses a pose correction that results in positioning error due to violation of modeling assumptions to make the classical stereo VO positioning more accurate. The pose correction network does not rely on the dataset with ground truth poses for training. The pose correction network also simultaneously generates a depth map and an explainability mask. Extensive experiments on the KITTI dataset show the pose correction network can significantly improve the positioning accuracy of the classical stereo VO system. Notably, the corrected classical stereo VO system’s average absolute trajectory error, average translational relative pose error, and average translational root-mean-square drift on a length of 100–800 m in the KITTI dataset is 13.77 cm, 0.038 m, and 1.08%, respectively. Therefore, the improved stereo VO system has almost reached the state of the art. MDPI 2021-07-11 /pmc/articles/PMC8309519/ /pubmed/34300475 http://dx.doi.org/10.3390/s21144735 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. 
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Zhang, Sumin Lu, Shouyi He, Rui Bao, Zhipeng Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning
title	Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning
title_full	Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning
title_fullStr	Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning
title_full_unstemmed	Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning
title_short	Stereo Visual Odometry Pose Correction through Unsupervised Deep Learning
title_sort	stereo visual odometry pose correction through unsupervised deep learning
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8309519/ https://www.ncbi.nlm.nih.gov/pubmed/34300475 http://dx.doi.org/10.3390/s21144735
work_keys_str_mv	AT zhangsumin stereovisualodometryposecorrectionthroughunsuperviseddeeplearning AT lushouyi stereovisualodometryposecorrectionthroughunsuperviseddeeplearning AT herui stereovisualodometryposecorrectionthroughunsuperviseddeeplearning AT baozhipeng stereovisualodometryposecorrectionthroughunsuperviseddeeplearning
