Joint Unsupervised Learning of Depth, Pose, Ground Normal Vector and Ground Segmentation by a Monocular Camera Sensor

We propose a completely unsupervised approach to simultaneously estimate scene depth, ego-pose, ground segmentation and the ground normal vector from monocular RGB video sequences alone. In our approach, the estimates of different scene structures benefit one another through joint optimization....


Bibliographic Details
Main Authors:	Xiong, Lu; Wen, Yongkun; Huang, Yuyao; Zhao, Junqiao; Tian, Wei
Format:	Online Article Text
Language:	English
Published:	MDPI 2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7374458/ https://www.ncbi.nlm.nih.gov/pubmed/32635370 http://dx.doi.org/10.3390/s20133737

_version_	1783561703795982336
author	Xiong, Lu; Wen, Yongkun; Huang, Yuyao; Zhao, Junqiao; Tian, Wei
author_facet	Xiong, Lu; Wen, Yongkun; Huang, Yuyao; Zhao, Junqiao; Tian, Wei
author_sort	Xiong, Lu
collection	PubMed
description	We propose a completely unsupervised approach to simultaneously estimate scene depth, ego-pose, ground segmentation and the ground normal vector from monocular RGB video sequences alone. In our approach, the estimates of different scene structures benefit one another through joint optimization. Specifically, we use a mutual information loss to pre-train the ground segmentation network before adding the corresponding self-learning labels obtained by a geometric method. By exploiting the static nature of the ground and its normal vector, scene depth and ego-motion can be learned efficiently through self-supervised learning. Extensive experiments on both the Cityscapes and KITTI benchmarks demonstrate a significant improvement in estimation accuracy for both scene depth and ego-pose. We also achieve an average error of about 3° for the estimated ground normal vectors. By deploying our proposed geometric constraints, the IoU accuracy of unsupervised ground segmentation is increased by 35% on the Cityscapes dataset.
format	Online Article Text
id	pubmed-7374458
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-7374458 2020-08-05 Joint Unsupervised Learning of Depth, Pose, Ground Normal Vector and Ground Segmentation by a Monocular Camera Sensor Xiong, Lu; Wen, Yongkun; Huang, Yuyao; Zhao, Junqiao; Tian, Wei Sensors (Basel) Article We propose a completely unsupervised approach to simultaneously estimate scene depth, ego-pose, ground segmentation and the ground normal vector from monocular RGB video sequences alone. In our approach, the estimates of different scene structures benefit one another through joint optimization. Specifically, we use a mutual information loss to pre-train the ground segmentation network before adding the corresponding self-learning labels obtained by a geometric method. By exploiting the static nature of the ground and its normal vector, scene depth and ego-motion can be learned efficiently through self-supervised learning. Extensive experiments on both the Cityscapes and KITTI benchmarks demonstrate a significant improvement in estimation accuracy for both scene depth and ego-pose. We also achieve an average error of about 3° for the estimated ground normal vectors. By deploying our proposed geometric constraints, the IoU accuracy of unsupervised ground segmentation is increased by 35% on the Cityscapes dataset. MDPI 2020-07-03 /pmc/articles/PMC7374458/ /pubmed/32635370 http://dx.doi.org/10.3390/s20133737 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Xiong, Lu Wen, Yongkun Huang, Yuyao Zhao, Junqiao Tian, Wei Joint Unsupervised Learning of Depth, Pose, Ground Normal Vector and Ground Segmentation by a Monocular Camera Sensor
title	Joint Unsupervised Learning of Depth, Pose, Ground Normal Vector and Ground Segmentation by a Monocular Camera Sensor
title_full	Joint Unsupervised Learning of Depth, Pose, Ground Normal Vector and Ground Segmentation by a Monocular Camera Sensor
title_fullStr	Joint Unsupervised Learning of Depth, Pose, Ground Normal Vector and Ground Segmentation by a Monocular Camera Sensor
title_full_unstemmed	Joint Unsupervised Learning of Depth, Pose, Ground Normal Vector and Ground Segmentation by a Monocular Camera Sensor
title_short	Joint Unsupervised Learning of Depth, Pose, Ground Normal Vector and Ground Segmentation by a Monocular Camera Sensor
title_sort	joint unsupervised learning of depth, pose, ground normal vector and ground segmentation by a monocular camera sensor
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7374458/ https://www.ncbi.nlm.nih.gov/pubmed/32635370 http://dx.doi.org/10.3390/s20133737
work_keys_str_mv	AT xionglu jointunsupervisedlearningofdepthposegroundnormalvectorandgroundsegmentationbyamonocularcamerasensor AT wenyongkun jointunsupervisedlearningofdepthposegroundnormalvectorandgroundsegmentationbyamonocularcamerasensor AT huangyuyao jointunsupervisedlearningofdepthposegroundnormalvectorandgroundsegmentationbyamonocularcamerasensor AT zhaojunqiao jointunsupervisedlearningofdepthposegroundnormalvectorandgroundsegmentationbyamonocularcamerasensor AT tianwei jointunsupervisedlearningofdepthposegroundnormalvectorandgroundsegmentationbyamonocularcamerasensor


Similar Items