Efficient Resource-Aware Convolutional Neural Architecture Search for Edge Computing with Pareto-Bayesian Optimization
With the development of deep learning technologies and edge computing, their combination can make artificial intelligence ubiquitous. Due to the constrained computation resources of edge devices, research on on-device deep learning focuses not only on model accuracy but also on model efficiency, for example, inference latency…
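The abstract (full text in the description field below) outlines two mechanisms: a per-operation inference consumption profiling model for the target device, and a resource-aware Pareto Bayesian search in which accuracy and inference latency act as constraints that regulate the search direction. As a rough illustration only, the following Python sketch shows how such a latency lookup and a latency-constrained Pareto filter over sampled candidates might fit together; the operation names, table values, additive latency composition, and budget are assumptions and do not reproduce the authors' implementation.

```python
# Illustrative sketch only (not the paper's code). It combines two ideas from
# the abstract: (1) estimating a sampled network's inference latency from a
# per-operation profiling table measured on the target edge device, and
# (2) keeping only search candidates that satisfy a latency budget and are
# not Pareto-dominated on (accuracy, latency). All names, table entries, and
# numbers are hypothetical assumptions.
from dataclasses import dataclass

# --- (1) Device profiling model: per-operation latency lookup ---------------
# Keys are (operation, channels, feature-map size); values are milliseconds
# that would, in practice, be measured once per target device.
LATENCY_TABLE_MS = {
    ("sep_conv_3x3", 32, 32): 0.41,
    ("sep_conv_5x5", 32, 32): 0.63,
    ("max_pool_3x3", 32, 32): 0.08,
    ("skip_connect", 32, 32): 0.01,
}

def network_latency_ms(ops: list[tuple[str, int, int]]) -> float:
    """Estimate whole-network latency as the sum of profiled per-op latencies.
    (The additive composition is an assumption made for illustration.)"""
    return sum(LATENCY_TABLE_MS[op] for op in ops)

# --- (2) Resource-aware Pareto filtering of sampled candidates --------------
@dataclass
class Candidate:
    arch_id: str
    accuracy: float    # estimated/validated accuracy of the sampled network
    latency_ms: float  # latency predicted by the profiling model above

def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse in both objectives and better in one."""
    return (a.accuracy >= b.accuracy and a.latency_ms <= b.latency_ms
            and (a.accuracy > b.accuracy or a.latency_ms < b.latency_ms))

def admissible(cands: list[Candidate], latency_budget_ms: float) -> list[Candidate]:
    """Drop latency-infeasible candidates, then keep the Pareto front."""
    feasible = [c for c in cands if c.latency_ms <= latency_budget_ms]
    return [c for c in feasible
            if not any(dominates(o, c) for o in feasible if o is not c)]

# Toy usage: score three sampled architectures against a 25 ms budget.
sampled = [
    Candidate("net-a", accuracy=0.93,
              latency_ms=network_latency_ms([("sep_conv_3x3", 32, 32)] * 40)),
    Candidate("net-b", accuracy=0.94,
              latency_ms=network_latency_ms([("sep_conv_5x5", 32, 32)] * 60)),
    Candidate("net-c", accuracy=0.91,
              latency_ms=network_latency_ms([("max_pool_3x3", 32, 32)] * 50)),
]
for c in admissible(sampled, latency_budget_ms=25.0):
    print(c.arch_id, round(c.accuracy, 3), round(c.latency_ms, 2), "ms")
```

Per the abstract, constraining candidates in this way is what gives the search a clearer direction and avoids generating large numbers of useless networks during the search.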
Main Authors: | Yang, Zhao; Zhang, Shengbing; Li, Ruxu; Li, Chuxi; Wang, Miao; Wang, Danghui; Zhang, Meng |
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7827625/ https://www.ncbi.nlm.nih.gov/pubmed/33435143 http://dx.doi.org/10.3390/s21020444 |
_version_ | 1783640808463794176 |
author | Yang, Zhao Zhang, Shengbing Li, Ruxu Li, Chuxi Wang, Miao Wang, Danghui Zhang, Meng |
author_facet | Yang, Zhao Zhang, Shengbing Li, Ruxu Li, Chuxi Wang, Miao Wang, Danghui Zhang, Meng |
author_sort | Yang, Zhao |
collection | PubMed |
description | With the development of deep learning technologies and edge computing, their combination can make artificial intelligence ubiquitous. Due to the constrained computation resources of edge devices, research on on-device deep learning focuses not only on model accuracy but also on model efficiency, for example, inference latency. Many attempts have been made to optimize existing deep learning models so that they can be deployed on edge devices, meet specific application requirements, and maintain high accuracy. Such work not only requires professional knowledge but also involves extensive experimentation, which limits the customization of neural networks for varied devices and application scenarios. To reduce human intervention in designing and optimizing the neural network structure, multi-objective neural architecture search methods have been proposed that automatically search for neural networks with high accuracy that also satisfy certain hardware performance requirements. However, current methods commonly use accuracy and inference latency only as performance indicators during the search and sample numerous network structures to obtain the required neural network. Without the search objectives actively regulating the search direction, a large number of useless networks are generated during the search, which greatly reduces search efficiency. Therefore, in this paper, an efficient resource-aware search method is proposed. Firstly, a network inference consumption profiling model is established for the specific device; it directly provides the resource consumption of each operation in the network structure and the inference latency of the entire sampled network. Next, on the basis of Bayesian search, a resource-aware Pareto Bayesian search is proposed, in which accuracy and inference latency are set as constraints that regulate the search direction. With a clearer search direction, the overall search efficiency is improved. Furthermore, a cell-based structure and lightweight operations are applied to optimize the search space and further enhance search efficiency. The experimental results demonstrate that with our method, the inference latency of the searched network structure is reduced by 94.71% without sacrificing accuracy. At the same time, the search efficiency increases by 18.18%. |
format | Online Article Text |
id | pubmed-7827625 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7827625 2021-01-25 Efficient Resource-Aware Convolutional Neural Architecture Search for Edge Computing with Pareto-Bayesian Optimization Yang, Zhao Zhang, Shengbing Li, Ruxu Li, Chuxi Wang, Miao Wang, Danghui Zhang, Meng Sensors (Basel) Article With the development of deep learning technologies and edge computing, their combination can make artificial intelligence ubiquitous. Due to the constrained computation resources of edge devices, research on on-device deep learning focuses not only on model accuracy but also on model efficiency, for example, inference latency. Many attempts have been made to optimize existing deep learning models so that they can be deployed on edge devices, meet specific application requirements, and maintain high accuracy. Such work not only requires professional knowledge but also involves extensive experimentation, which limits the customization of neural networks for varied devices and application scenarios. To reduce human intervention in designing and optimizing the neural network structure, multi-objective neural architecture search methods have been proposed that automatically search for neural networks with high accuracy that also satisfy certain hardware performance requirements. However, current methods commonly use accuracy and inference latency only as performance indicators during the search and sample numerous network structures to obtain the required neural network. Without the search objectives actively regulating the search direction, a large number of useless networks are generated during the search, which greatly reduces search efficiency. Therefore, in this paper, an efficient resource-aware search method is proposed. Firstly, a network inference consumption profiling model is established for the specific device; it directly provides the resource consumption of each operation in the network structure and the inference latency of the entire sampled network. Next, on the basis of Bayesian search, a resource-aware Pareto Bayesian search is proposed, in which accuracy and inference latency are set as constraints that regulate the search direction. With a clearer search direction, the overall search efficiency is improved. Furthermore, a cell-based structure and lightweight operations are applied to optimize the search space and further enhance search efficiency. The experimental results demonstrate that with our method, the inference latency of the searched network structure is reduced by 94.71% without sacrificing accuracy. At the same time, the search efficiency increases by 18.18%. MDPI 2021-01-10 /pmc/articles/PMC7827625/ /pubmed/33435143 http://dx.doi.org/10.3390/s21020444 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Yang, Zhao Zhang, Shengbing Li, Ruxu Li, Chuxi Wang, Miao Wang, Danghui Zhang, Meng Efficient Resource-Aware Convolutional Neural Architecture Search for Edge Computing with Pareto-Bayesian Optimization |
title | Efficient Resource-Aware Convolutional Neural Architecture Search for Edge Computing with Pareto-Bayesian Optimization |
title_full | Efficient Resource-Aware Convolutional Neural Architecture Search for Edge Computing with Pareto-Bayesian Optimization |
title_fullStr | Efficient Resource-Aware Convolutional Neural Architecture Search for Edge Computing with Pareto-Bayesian Optimization |
title_full_unstemmed | Efficient Resource-Aware Convolutional Neural Architecture Search for Edge Computing with Pareto-Bayesian Optimization |
title_short | Efficient Resource-Aware Convolutional Neural Architecture Search for Edge Computing with Pareto-Bayesian Optimization |
title_sort | efficient resource-aware convolutional neural architecture search for edge computing with pareto-bayesian optimization |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7827625/ https://www.ncbi.nlm.nih.gov/pubmed/33435143 http://dx.doi.org/10.3390/s21020444 |
work_keys_str_mv | AT yangzhao efficientresourceawareconvolutionalneuralarchitecturesearchforedgecomputingwithparetobayesianoptimization AT zhangshengbing efficientresourceawareconvolutionalneuralarchitecturesearchforedgecomputingwithparetobayesianoptimization AT liruxu efficientresourceawareconvolutionalneuralarchitecturesearchforedgecomputingwithparetobayesianoptimization AT lichuxi efficientresourceawareconvolutionalneuralarchitecturesearchforedgecomputingwithparetobayesianoptimization AT wangmiao efficientresourceawareconvolutionalneuralarchitecturesearchforedgecomputingwithparetobayesianoptimization AT wangdanghui efficientresourceawareconvolutionalneuralarchitecturesearchforedgecomputingwithparetobayesianoptimization AT zhangmeng efficientresourceawareconvolutionalneuralarchitecturesearchforedgecomputingwithparetobayesianoptimization |