
Knowledge distillation based on multi-layer fusion features

Bibliographic Details
Main Authors: Tan, Shengyuan, Guo, Rongzuo, Tang, Jialiang, Jiang, Ning, Zou, Junying
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10461825/
https://www.ncbi.nlm.nih.gov/pubmed/37639443
http://dx.doi.org/10.1371/journal.pone.0285901
_version_ 1785097917736943616
author Tan, Shengyuan
Guo, Rongzuo
Tang, Jialiang
Jiang, Ning
Zou, Junying
author_facet Tan, Shengyuan
Guo, Rongzuo
Tang, Jialiang
Jiang, Ning
Zou, Junying
author_sort Tan, Shengyuan
collection PubMed
description Knowledge distillation improves the performance of a small student network by guiding it to learn knowledge from a pre-trained, high-performance but bulky teacher network. Generally, most current knowledge distillation methods extract relatively simple features from the middle or bottom layers of the teacher network for knowledge transfer. However, these methods ignore feature fusion, even though fused features contain richer information. We believe that the richer the information contained in the knowledge the teacher delivers, the easier it is for the student to perform well. In this paper, we propose a new method called Multi-feature Fusion Knowledge Distillation (MFKD) to extract and utilize the expressive fused features of the teacher network. Specifically, we extract feature maps from different positions in the network, i.e., the middle layer, the bottom layer, and even the front layer. To properly utilize these features, we design a multi-feature fusion scheme to integrate them. Compared to features extracted from a single location in the teacher network, the final fused feature map contains more meaningful information. Extensive experiments on image classification tasks demonstrate that a student network trained by our MFKD can learn from the fused features, leading to superior performance. The results show that MFKD improves the Top-1 accuracy of ResNet20 and VGG8 by 1.82% and 3.35%, respectively, on the CIFAR-100 dataset, outperforming many state-of-the-art methods.
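The abstract only describes MFKD at a high level, so the following is a minimal PyTorch sketch of the general idea it states: tap feature maps from several depths of the teacher, fuse them into one map, and add a feature-mimicking term to the student's loss. The fusion scheme shown (1x1 projections to a common width, bilinear resizing, summation) and all names (FeatureFusion, mfkd_loss, fused_dim, beta) are assumptions for illustration, not the paper's actual implementation.

```python
# Illustrative sketch of multi-layer feature fusion for distillation.
# NOT the paper's implementation: the fusion scheme and all names here
# are assumptions made for this example.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureFusion(nn.Module):
    """Fuses feature maps tapped from several depths of the teacher."""
    def __init__(self, in_channels, fused_dim):
        super().__init__()
        # One 1x1 projection per tapped layer (front/middle/bottom),
        # mapping each feature map to a common channel width.
        self.projs = nn.ModuleList(
            nn.Conv2d(c, fused_dim, kernel_size=1) for c in in_channels
        )

    def forward(self, feats):
        # Resize every projected map to the deepest (smallest) spatial
        # resolution, then fuse by elementwise summation.
        target = feats[-1].shape[-2:]
        fused = 0
        for proj, f in zip(self.projs, feats):
            fused = fused + F.interpolate(
                proj(f), size=target, mode="bilinear", align_corners=False
            )
        return fused

def mfkd_loss(logits, labels, student_feat, teacher_fused, beta=1.0):
    # Standard task loss plus an L2 term pulling the student's feature
    # map toward the teacher's fused map (beta is a hypothetical
    # balancing weight; the paper's loss weighting may differ).
    return F.cross_entropy(logits, labels) + beta * F.mse_loss(
        student_feat, teacher_fused
    )
```

In use, the teacher's tapped features would be computed under torch.no_grad() from a frozen teacher, and the student's tapped feature map would need a similar projection to match the shape of the fused teacher map before the L2 term applies.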
format Online
Article
Text
id pubmed-10461825
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10461825 2023-08-29 Knowledge distillation based on multi-layer fusion features Tan, Shengyuan Guo, Rongzuo Tang, Jialiang Jiang, Ning Zou, Junying PLoS One Research Article Knowledge distillation improves the performance of a small student network by guiding it to learn knowledge from a pre-trained, high-performance but bulky teacher network. Generally, most current knowledge distillation methods extract relatively simple features from the middle or bottom layers of the teacher network for knowledge transfer. However, these methods ignore feature fusion, even though fused features contain richer information. We believe that the richer the information contained in the knowledge the teacher delivers, the easier it is for the student to perform well. In this paper, we propose a new method called Multi-feature Fusion Knowledge Distillation (MFKD) to extract and utilize the expressive fused features of the teacher network. Specifically, we extract feature maps from different positions in the network, i.e., the middle layer, the bottom layer, and even the front layer. To properly utilize these features, we design a multi-feature fusion scheme to integrate them. Compared to features extracted from a single location in the teacher network, the final fused feature map contains more meaningful information. Extensive experiments on image classification tasks demonstrate that a student network trained by our MFKD can learn from the fused features, leading to superior performance. The results show that MFKD improves the Top-1 accuracy of ResNet20 and VGG8 by 1.82% and 3.35%, respectively, on the CIFAR-100 dataset, outperforming many state-of-the-art methods. Public Library of Science 2023-08-28 /pmc/articles/PMC10461825/ /pubmed/37639443 http://dx.doi.org/10.1371/journal.pone.0285901 Text en © 2023 Tan et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Tan, Shengyuan
Guo, Rongzuo
Tang, Jialiang
Jiang, Ning
Zou, Junying
Knowledge distillation based on multi-layer fusion features
title Knowledge distillation based on multi-layer fusion features
title_full Knowledge distillation based on multi-layer fusion features
title_fullStr Knowledge distillation based on multi-layer fusion features
title_full_unstemmed Knowledge distillation based on multi-layer fusion features
title_short Knowledge distillation based on multi-layer fusion features
title_sort knowledge distillation based on multi-layer fusion features
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10461825/
https://www.ncbi.nlm.nih.gov/pubmed/37639443
http://dx.doi.org/10.1371/journal.pone.0285901
work_keys_str_mv AT tanshengyuan knowledgedistillationbasedonmultilayerfusionfeatures
AT guorongzuo knowledgedistillationbasedonmultilayerfusionfeatures
AT tangjialiang knowledgedistillationbasedonmultilayerfusionfeatures
AT jiangning knowledgedistillationbasedonmultilayerfusionfeatures
AT zoujunying knowledgedistillationbasedonmultilayerfusionfeatures