Joint Unsupervised Learning of Depth, Pose, Ground Normal Vector and Ground Segmentation by a Monocular Camera Sensor
| Main Authors: | |
|---|---|
| Format: | Online Article Text |
| Language: | English |
| Published: | MDPI, 2020 |
| Subjects: | |
| Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7374458/ https://www.ncbi.nlm.nih.gov/pubmed/32635370 http://dx.doi.org/10.3390/s20133737 |
| Summary: | We propose a completely unsupervised approach to simultaneously estimate scene depth, ego-pose, ground segmentation and the ground normal vector from only monocular RGB video sequences. In our approach, estimates for the different scene structures mutually benefit each other through joint optimization. Specifically, we use a mutual information loss to pre-train the ground segmentation network before adding the corresponding self-learning labels obtained by a geometric method. By exploiting the static nature of the ground and its normal vector, scene depth and ego-motion can be learned efficiently through the self-supervised learning procedure. Extensive experimental results on both the Cityscapes and KITTI benchmarks demonstrate a significant improvement in estimation accuracy for both scene depth and ego-pose with our approach. We also achieve an average error of about 3 [Formula: see text] for the estimated ground normal vectors. By deploying our proposed geometric constraints, the IoU accuracy of unsupervised ground segmentation is increased by 35% on the Cityscapes dataset. |
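
The summary relies on a simple piece of ground-plane geometry: once the ground normal and the camera height are known, the depth of every ground pixel is fixed by the pinhole model. Below is a minimal illustrative sketch of that constraint (my own illustration, not code from the paper); the intrinsics, the camera height of 1.65 m, and the sign convention for the normal are assumptions.

```python
# Sketch: per-pixel depth of the ground plane from a known normal n and camera
# height h. For a ground point X = z * K^{-1} p (pinhole model, p a homogeneous
# pixel), the plane constraint n . X = h gives z = h / (n . K^{-1} p).
import numpy as np

def ground_depth(u, v, K, n, h):
    """Depth (z) of the ground plane at pixel (u, v).

    K : 3x3 camera intrinsics
    n : unit ground normal in camera coordinates, oriented so that n . X = h
        holds for 3D points X on the ground (sign convention is an assumption)
    h : camera height above the ground plane (same units as the returned depth)
    """
    p = np.array([u, v, 1.0])            # homogeneous pixel coordinate
    ray = np.linalg.inv(K) @ p           # viewing ray with unit z-component
    denom = n @ ray
    if abs(denom) < 1e-9:                # ray (nearly) parallel to the ground
        return np.inf
    return h / denom

# Toy usage with made-up, KITTI-like intrinsics: camera 1.65 m above the road,
# ground normal pointing straight "down" in a y-down camera frame.
K = np.array([[721.5, 0.0, 609.6],
              [0.0, 721.5, 172.9],
              [0.0, 0.0, 1.0]])
n = np.array([0.0, 1.0, 0.0])
print(ground_depth(609.6, 300.0, K, n, h=1.65))  # ~9.4 m for a pixel below the horizon
```

In a joint learning setup of this kind, such a geometric depth for pixels labeled as ground can be compared against the depth network's prediction, giving the segmentation, normal, and depth estimates a shared constraint to agree on.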