
Convolution-Based Encoding of Depth Images for Transfer Learning in RGB-D Scene Classification

Classification of indoor environments is a challenging problem. The availability of low-cost depth sensors has opened up a new research area of using depth information in addition to color image (RGB) data for scene understanding. Transfer learning of deep convolutional networks with pairs of RGB an...


Bibliographic Details
Main authors:	Gopalapillai, Radhakrishnan; Gupta, Deepa; Zakariah, Mohammed; Alotaibi, Yousef Ajami
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8659746/ https://www.ncbi.nlm.nih.gov/pubmed/34883955 http://dx.doi.org/10.3390/s21237950

_version_	1784613037701857280
author	Gopalapillai, Radhakrishnan; Gupta, Deepa; Zakariah, Mohammed; Alotaibi, Yousef Ajami
author_sort	Gopalapillai, Radhakrishnan
collection	PubMed
description	Classification of indoor environments is a challenging problem. The availability of low-cost depth sensors has opened up a new research area of using depth information in addition to color image (RGB) data for scene understanding. Transfer learning of deep convolutional networks with pairs of RGB and depth (RGB-D) images has to deal with integrating these two modalities. Single-channel depth images are often converted to three-channel images by extracting horizontal disparity, height above ground, and the angle of the pixel’s local surface normal (HHA) to apply transfer learning using networks trained on the Places365 dataset. The high computational cost of HHA encoding can be a major disadvantage for the real-time prediction of scenes, although this may be less important during the training phase. We propose a new, computationally efficient encoding method that can be integrated with any convolutional neural network. We show that our encoding approach performs equally well or better in a multimodal transfer learning setup for scene classification. Our encoding is implemented in a customized and pretrained VGG16 Net. We address the class imbalance problem seen in the image dataset using a method based on the synthetic minority oversampling technique (SMOTE) at the feature level. With appropriate image augmentation and fine-tuning, our network achieves scene classification accuracy comparable to that of other state-of-the-art architectures.
format	Online Article Text
id	pubmed-8659746
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8659746 2021-12-10 Convolution-Based Encoding of Depth Images for Transfer Learning in RGB-D Scene Classification Gopalapillai, Radhakrishnan; Gupta, Deepa; Zakariah, Mohammed; Alotaibi, Yousef Ajami Sensors (Basel) Article MDPI 2021-11-28 /pmc/articles/PMC8659746/ /pubmed/34883955 http://dx.doi.org/10.3390/s21237950 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Convolution-Based Encoding of Depth Images for Transfer Learning in RGB-D Scene Classification
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8659746/ https://www.ncbi.nlm.nih.gov/pubmed/34883955 http://dx.doi.org/10.3390/s21237950
