
Accelerating On-Device Learning with Layer-Wise Processor Selection Method on Unified Memory

Recent studies have applied the superior performance of deep learning to mobile devices, and these studies have enabled the running of the deep learning model on a mobile device with limited computing power. However, there is performance degradation of the deep learning model when it is deployed in...


Bibliographic Details
Main authors:	Ha, Donghee; Kim, Mooseop; Moon, KyeongDeok; Jeong, Chi Yoon
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8037599/ https://www.ncbi.nlm.nih.gov/pubmed/33805349 http://dx.doi.org/10.3390/s21072364

collection	PubMed
description	Recent studies have brought deep learning to mobile devices, making it possible to run deep learning models on devices with limited computing power. However, a model's performance degrades when it is deployed on mobile devices, because each device has different sensors. Addressing this requires training a network model specific to each mobile device. We therefore propose an acceleration method for on-device learning to mitigate device heterogeneity. The proposed method uses unified memory to reduce data-transfer latency during network model training. In addition, we propose a layer-wise processor selection method that accounts for the latency incurred when the forward propagation step and the backpropagation step of the same layer run on different processors. Experiments were performed on an ODROID-XU4 with the ResNet-18 model; the results indicate that the proposed method reduces latency by up to 28.4% compared to the central processing unit (CPU) and by up to 21.8% compared to the graphics processing unit (GPU). Through experiments using various batch sizes to measure the average power consumption, we confirmed that device heterogeneity is alleviated by performing on-device learning using the proposed method.
id	pubmed-8037599
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
journal	Sensors (Basel)
published	2021-03-29
license	© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
