Multistructure-Based Collaborative Online Distillation
Recently, deep learning has achieved state-of-the-art performance in more aspects than traditional shallow architecture-based machine-learning methods. However, in order to achieve higher accuracy, it is usually necessary to extend the network depth or ensemble the results of different neural networks. ...
Main Authors: | Gao, Liang; Lan, Xu; Mi, Haibo; Feng, Dawei; Xu, Kele; Peng, Yuxing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2019 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7514841/ https://www.ncbi.nlm.nih.gov/pubmed/33267071 http://dx.doi.org/10.3390/e21040357 |
_version_ | 1783586681736134656 |
---|---|
author | Gao, Liang Lan, Xu Mi, Haibo Feng, Dawei Xu, Kele Peng, Yuxing |
author_facet | Gao, Liang Lan, Xu Mi, Haibo Feng, Dawei Xu, Kele Peng, Yuxing |
author_sort | Gao, Liang |
collection | PubMed |
description | Recently, deep learning has achieved state-of-the-art performance in more aspects than traditional shallow architecture-based machine-learning methods. However, in order to achieve higher accuracy, it is usually necessary to extend the network depth or ensemble the results of different neural networks. Increasing network depth or ensembling different networks increases the demand for memory resources and computing resources. This leads to difficulties in deploying depth-learning models in resource-constrained scenarios such as drones, mobile phones, and autonomous driving. Improving network performance without expanding the network scale has become a hot topic for research. In this paper, we propose a cross-architecture online-distillation approach to solve this problem by transmitting supplementary information on different networks. We use the ensemble method to aggregate networks of different structures, thus forming better teachers than traditional distillation methods. In addition, discontinuous distillation with progressively enhanced constraints is used to replace fixed distillation in order to reduce loss of information diversity in the distillation process. Our training method improves the distillation effect and achieves strong network-performance improvement. We used some popular models to validate the results. On the CIFAR100 dataset, AlexNet’s accuracy was improved by 5.94%, VGG by 2.88%, ResNet by 5.07%, and DenseNet by 1.28%. Extensive experiments were conducted to demonstrate the effectiveness of the proposed method. On the CIFAR10, CIFAR100, and ImageNet datasets, we observed significant improvements over traditional knowledge distillation. |
format | Online Article Text |
id | pubmed-7514841 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-75148412020-11-09 Multistructure-Based Collaborative Online Distillation Gao, Liang Lan, Xu Mi, Haibo Feng, Dawei Xu, Kele Peng, Yuxing Entropy (Basel) Article Recently, deep learning has achieved state-of-the-art performance in more aspects than traditional shallow architecture-based machine-learning methods. However, in order to achieve higher accuracy, it is usually necessary to extend the network depth or ensemble the results of different neural networks. Increasing network depth or ensembling different networks increases the demand for memory resources and computing resources. This leads to difficulties in deploying depth-learning models in resource-constrained scenarios such as drones, mobile phones, and autonomous driving. Improving network performance without expanding the network scale has become a hot topic for research. In this paper, we propose a cross-architecture online-distillation approach to solve this problem by transmitting supplementary information on different networks. We use the ensemble method to aggregate networks of different structures, thus forming better teachers than traditional distillation methods. In addition, discontinuous distillation with progressively enhanced constraints is used to replace fixed distillation in order to reduce loss of information diversity in the distillation process. Our training method improves the distillation effect and achieves strong network-performance improvement. We used some popular models to validate the results. On the CIFAR100 dataset, AlexNet’s accuracy was improved by 5.94%, VGG by 2.88%, ResNet by 5.07%, and DenseNet by 1.28%. Extensive experiments were conducted to demonstrate the effectiveness of the proposed method. On the CIFAR10, CIFAR100, and ImageNet datasets, we observed significant improvements over traditional knowledge distillation. MDPI 2019-04-02 /pmc/articles/PMC7514841/ /pubmed/33267071 http://dx.doi.org/10.3390/e21040357 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Gao, Liang Lan, Xu Mi, Haibo Feng, Dawei Xu, Kele Peng, Yuxing Multistructure-Based Collaborative Online Distillation |
title | Multistructure-Based Collaborative Online Distillation |
title_full | Multistructure-Based Collaborative Online Distillation |
title_fullStr | Multistructure-Based Collaborative Online Distillation |
title_full_unstemmed | Multistructure-Based Collaborative Online Distillation |
title_short | Multistructure-Based Collaborative Online Distillation |
title_sort | multistructure-based collaborative online distillation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7514841/ https://www.ncbi.nlm.nih.gov/pubmed/33267071 http://dx.doi.org/10.3390/e21040357 |
work_keys_str_mv | AT gaoliang multistructurebasedcollaborativeonlinedistillation AT lanxu multistructurebasedcollaborativeonlinedistillation AT mihaibo multistructurebasedcollaborativeonlinedistillation AT fengdawei multistructurebasedcollaborativeonlinedistillation AT xukele multistructurebasedcollaborativeonlinedistillation AT pengyuxing multistructurebasedcollaborativeonlinedistillation |
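The abstract above describes a collaborative online-distillation scheme in which several networks of different architectures train together and are guided by the ensemble of their own softened predictions, with the distillation constraint applied intermittently and progressively strengthened. Below is a minimal sketch of that general idea only; the function name `distillation_step`, the toy peer models, the temperature, and the alternating-epoch weight schedule are illustrative assumptions, not the authors' published implementation.

```python
# Generic sketch of multi-peer online distillation with an ensemble teacher.
# All model classes, hyperparameters, and the training schedule are placeholders.
import torch
import torch.nn.functional as F


def distillation_step(models, optimizers, images, labels,
                      temperature=3.0, kd_weight=0.5, distill=True):
    """One collaborative training step for a group of architecturally
    different peer networks that teach each other via their ensemble."""
    # Forward pass for every peer on the same batch.
    logits = [m(images) for m in models]

    # Ensemble teacher: average of the peers' temperature-softened predictions.
    with torch.no_grad():
        teacher_prob = torch.stack(
            [F.softmax(l / temperature, dim=1) for l in logits]).mean(dim=0)

    for peer_logits, opt in zip(logits, optimizers):
        # Ordinary supervised loss on the ground-truth labels.
        loss = F.cross_entropy(peer_logits, labels)
        if distill:
            # KL term pulling each peer toward the ensemble teacher;
            # the T^2 factor is the usual distillation gradient rescaling.
            log_prob = F.log_softmax(peer_logits / temperature, dim=1)
            kd = F.kl_div(log_prob, teacher_prob,
                          reduction="batchmean") * (temperature ** 2)
            loss = loss + kd_weight * kd
        opt.zero_grad()
        loss.backward()
        opt.step()


if __name__ == "__main__":
    # Smoke test with two structurally different peers on random CIFAR-shaped data.
    torch.manual_seed(0)
    peers = [
        torch.nn.Sequential(torch.nn.Flatten(),
                            torch.nn.Linear(3 * 32 * 32, 10)),
        torch.nn.Sequential(torch.nn.Flatten(),
                            torch.nn.Linear(3 * 32 * 32, 64),
                            torch.nn.ReLU(),
                            torch.nn.Linear(64, 10)),
    ]
    opts = [torch.optim.SGD(p.parameters(), lr=0.01) for p in peers]
    x = torch.randn(8, 3, 32, 32)
    y = torch.randint(0, 10, (8,))
    for epoch in range(4):
        # Hypothetical schedule: distill only on alternate epochs, with a
        # distillation weight that grows as training progresses.
        distillation_step(peers, opts, x, y,
                          kd_weight=0.1 + 0.2 * epoch,
                          distill=(epoch % 2 == 1))
```

The paper's experiments use AlexNet, VGG, ResNet, and DenseNet peers on CIFAR10, CIFAR100, and ImageNet; the sketch only shows, under the stated assumptions, how an averaged ensemble of peers can serve as a teacher signal that is switched on intermittently and weighted more heavily over time.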