Knowledge distillation of multi-scale dense prediction transformer for self-supervised depth estimation
Depth estimation is an inverse projection problem that estimates pixel-level distances from a single image. Although supervised methods have shown promising results, they have an intrinsic limitation in requiring ground-truth depth from an external sensor. Self-supervised depth estimation, on the other hand, relieves the burden of collecting calibrated training data, although a large performance gap remains between supervised and self-supervised methods. The objective of this study is to reduce the performance gap between the supervised and self-supervised approaches. The loss function of previous self-supervised methods is mainly based on a photometric error, which is computed indirectly from synthesized images using depth and pose estimates. In this paper, we argue that a direct depth cue is more effective for training a depth estimation network. To obtain the direct depth cue, we employed a knowledge distillation technique, i.e., a teacher-student learning framework. The teacher network was trained in a self-supervised manner based on a photometric error, and its predictions were utilized to train a student network. We constructed a multi-scale dense prediction transformer with Monte Carlo dropout, and a multi-scale distillation loss was proposed to train the student network based on the ensemble of stochastic estimates. Experiments were conducted on the KITTI and Make3D datasets, and our proposed method achieved state-of-the-art accuracy in self-supervised depth estimation. Our code is publicly available at https://github.com/ji-min-song/KD-of-MS-DPT.
Main Authors: | Song, Jimin; Lee, Sang Jun
Format: | Online Article Text
Language: | English
Published: | Nature Publishing Group UK, 2023
Subjects: | Article
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10622578/ https://www.ncbi.nlm.nih.gov/pubmed/37919392 http://dx.doi.org/10.1038/s41598-023-46178-w
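The abstract notes that the teacher network is trained from a photometric error computed on views synthesized from depth and pose estimates, rather than from ground-truth depth. The sketch below illustrates that indirect training signal; the SSIM + L1 mix and the weight `alpha` follow common practice in self-supervised depth estimation (e.g. Monodepth2) and are assumptions here, not the authors' exact implementation.

```python
# Minimal sketch of the photometric error used as the self-supervised
# training signal for the teacher. Assumes `synthesized` is the target
# frame reconstructed by warping a source frame with the predicted depth
# and camera pose; alpha = 0.85 is a conventional choice, not the paper's.
import torch
import torch.nn.functional as F

def dssim(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """SSIM-based dissimilarity, (1 - SSIM) / 2, over a 3x3 window."""
    mu_x = F.avg_pool2d(x, 3, 1, 1)
    mu_y = F.avg_pool2d(y, 3, 1, 1)
    sigma_x = F.avg_pool2d(x * x, 3, 1, 1) - mu_x ** 2
    sigma_y = F.avg_pool2d(y * y, 3, 1, 1) - mu_y ** 2
    sigma_xy = F.avg_pool2d(x * y, 3, 1, 1) - mu_x * mu_y
    c1, c2 = 0.01 ** 2, 0.03 ** 2
    num = (2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (sigma_x + sigma_y + c2)
    return torch.clamp((1 - num / den) / 2, 0, 1)

def photometric_error(target: torch.Tensor,
                      synthesized: torch.Tensor,
                      alpha: float = 0.85) -> torch.Tensor:
    """Per-pixel weighted SSIM + L1 error between the target frame and the
    frame synthesized from depth and pose estimates (the indirect signal
    the abstract contrasts with direct depth cues)."""
    l1 = (target - synthesized).abs().mean(1, keepdim=True)
    return alpha * dssim(target, synthesized).mean(1, keepdim=True) + (1 - alpha) * l1
```

In full pipelines this error is typically evaluated against each available source frame, often taking the per-pixel minimum over frames to reduce the impact of occlusions.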
_version_ | 1785130570651533312 |
author | Song, Jimin Lee, Sang Jun |
author_facet | Song, Jimin Lee, Sang Jun |
author_sort | Song, Jimin |
collection | PubMed |
description | Depth estimation is an inverse projection problem that estimates pixel-level distances from a single image. Although supervised methods have shown promising results, they have an intrinsic limitation in requiring ground-truth depth from an external sensor. Self-supervised depth estimation, on the other hand, relieves the burden of collecting calibrated training data, although a large performance gap remains between supervised and self-supervised methods. The objective of this study is to reduce the performance gap between the supervised and self-supervised approaches. The loss function of previous self-supervised methods is mainly based on a photometric error, which is computed indirectly from synthesized images using depth and pose estimates. In this paper, we argue that a direct depth cue is more effective for training a depth estimation network. To obtain the direct depth cue, we employed a knowledge distillation technique, i.e., a teacher-student learning framework. The teacher network was trained in a self-supervised manner based on a photometric error, and its predictions were utilized to train a student network. We constructed a multi-scale dense prediction transformer with Monte Carlo dropout, and a multi-scale distillation loss was proposed to train the student network based on the ensemble of stochastic estimates. Experiments were conducted on the KITTI and Make3D datasets, and our proposed method achieved state-of-the-art accuracy in self-supervised depth estimation. Our code is publicly available at https://github.com/ji-min-song/KD-of-MS-DPT.
format | Online Article Text |
id | pubmed-10622578 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10622578 2023-11-04 Knowledge distillation of multi-scale dense prediction transformer for self-supervised depth estimation Song, Jimin Lee, Sang Jun Sci Rep Article Depth estimation is an inverse projection problem that estimates pixel-level distances from a single image. Although supervised methods have shown promising results, they have an intrinsic limitation in requiring ground-truth depth from an external sensor. Self-supervised depth estimation, on the other hand, relieves the burden of collecting calibrated training data, although a large performance gap remains between supervised and self-supervised methods. The objective of this study is to reduce the performance gap between the supervised and self-supervised approaches. The loss function of previous self-supervised methods is mainly based on a photometric error, which is computed indirectly from synthesized images using depth and pose estimates. In this paper, we argue that a direct depth cue is more effective for training a depth estimation network. To obtain the direct depth cue, we employed a knowledge distillation technique, i.e., a teacher-student learning framework. The teacher network was trained in a self-supervised manner based on a photometric error, and its predictions were utilized to train a student network. We constructed a multi-scale dense prediction transformer with Monte Carlo dropout, and a multi-scale distillation loss was proposed to train the student network based on the ensemble of stochastic estimates. Experiments were conducted on the KITTI and Make3D datasets, and our proposed method achieved state-of-the-art accuracy in self-supervised depth estimation. Our code is publicly available at https://github.com/ji-min-song/KD-of-MS-DPT. Nature Publishing Group UK 2023-11-02 /pmc/articles/PMC10622578/ /pubmed/37919392 http://dx.doi.org/10.1038/s41598-023-46178-w Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle | Article Song, Jimin Lee, Sang Jun Knowledge distillation of multi-scale dense prediction transformer for self-supervised depth estimation |
title | Knowledge distillation of multi-scale dense prediction transformer for self-supervised depth estimation |
title_full | Knowledge distillation of multi-scale dense prediction transformer for self-supervised depth estimation |
title_fullStr | Knowledge distillation of multi-scale dense prediction transformer for self-supervised depth estimation |
title_full_unstemmed | Knowledge distillation of multi-scale dense prediction transformer for self-supervised depth estimation |
title_short | Knowledge distillation of multi-scale dense prediction transformer for self-supervised depth estimation |
title_sort | knowledge distillation of multi-scale dense prediction transformer for self-supervised depth estimation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10622578/ https://www.ncbi.nlm.nih.gov/pubmed/37919392 http://dx.doi.org/10.1038/s41598-023-46178-w |
work_keys_str_mv | AT songjimin knowledgedistillationofmultiscaledensepredictiontransformerforselfsuperviseddepthestimation AT leesangjun knowledgedistillationofmultiscaledensepredictiontransformerforselfsuperviseddepthestimation |
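The core method in the description field, an ensemble of Monte Carlo dropout teacher predictions distilled into a multi-scale student, can be summarized with a short sketch. The L1 form of the loss, the uniform weighting across scales, and all function names below are illustrative assumptions; the authors' actual implementation is at the linked GitHub repository.

```python
# Hedged sketch of the teacher-student distillation described in the
# abstract: the teacher runs several stochastic forward passes with
# dropout kept active, the averaged depth serves as a direct depth cue,
# and the student's multi-scale outputs are regressed toward it.
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def mc_dropout_depth(teacher: nn.Module, image: torch.Tensor,
                     n_samples: int = 8) -> torch.Tensor:
    """Ensemble of stochastic estimates: keep dropout layers active at
    inference time and average the teacher's depth predictions."""
    teacher.train()  # train mode leaves dropout stochastic
    samples = torch.stack([teacher(image) for _ in range(n_samples)], dim=0)
    return samples.mean(dim=0)  # pseudo ground-truth depth, shape (B, 1, H, W)

def multi_scale_distillation_loss(student_depths: list[torch.Tensor],
                                  pseudo_depth: torch.Tensor) -> torch.Tensor:
    """L1 distillation between each student prediction scale and the
    teacher's ensemble estimate, resized to match that scale."""
    loss = torch.zeros((), device=pseudo_depth.device)
    for pred in student_depths:  # e.g. the scales of a dense prediction head
        target = F.interpolate(pseudo_depth, size=pred.shape[-2:],
                               mode="bilinear", align_corners=False)
        loss = loss + F.l1_loss(pred, target)
    return loss / len(student_depths)
```

Averaging stochastic forward passes both denoises the teacher's pseudo-labels and gives the student a dense, direct regression target at every output scale, which is the "direct depth cue" the abstract contrasts with the indirect photometric error.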