
Compression Helps Deep Learning in Image Classification

Bibliographic Details
Main Authors: Yang, En-Hui, Amer, Hossam, Jiang, Yanbing
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8304067/
https://www.ncbi.nlm.nih.gov/pubmed/34356422
http://dx.doi.org/10.3390/e23070881
_version_ 1783727244017926144
author Yang, En-Hui
Amer, Hossam
Jiang, Yanbing
author_facet Yang, En-Hui
Amer, Hossam
Jiang, Yanbing
author_sort Yang, En-Hui
collection PubMed
description The impact of JPEG compression on deep learning (DL) in image classification is revisited. Given an underlying deep neural network (DNN) pre-trained with pristine ImageNet images, it is demonstrated that, if, for any original image, one can select, among its many JPEG compressed versions including its original version, a suitable version as an input to the underlying DNN, then the classification accuracy of the underlying DNN can be improved significantly while the size in bits of the selected input is, on average, reduced dramatically in comparison with the original image. This is in contrast to the conventional understanding that JPEG compression generally degrades the classification accuracy of DL. Specifically, for each original image, consider its 10 JPEG compressed versions with their quality factor (QF) values from [Formula: see text]. Under the assumption that the ground truth label of the original image is known at the time of selecting an input, but unknown to the underlying DNN, we present a selector called Highest Rank Selector (HRS). It is shown that HRS is optimal in the sense of achieving the highest Top k accuracy on any set of images for any k among all possible selectors. When the underlying DNN is Inception V3 or ResNet-50 V2, HRS improves, on average, the Top 1 classification accuracy and Top 5 classification accuracy on the whole ImageNet validation dataset by 5.6% and 1.9%, respectively, while reducing the input size in bits dramatically—the compression ratio (CR) between the size of the original images and the size of the selected input images by HRS is 8 for the whole ImageNet validation dataset. When the ground truth label of the original image is unknown at the time of selection, we further propose a new convolutional neural network (CNN) topology which is based on the underlying DNN and takes the original image and its 10 JPEG compressed versions as 11 parallel inputs. It is demonstrated that the proposed new CNN topology, even when partially trained, can consistently improve the Top 1 accuracy of Inception V3 and ResNet-50 V2 by approximately 0.4% and the Top 5 accuracy of Inception V3 and ResNet-50 V2 by 0.32% and 0.2%, respectively. Other selectors without the knowledge of the ground truth label of the original image are also presented. They maintain the Top 1 accuracy, the Top 5 accuracy, or the Top 1 and Top 5 accuracy of the underlying DNN, while achieving CRs of 8.8, 3.3, and 3.1, respectively.
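For readers who want a concrete picture of the Highest Rank Selector (HRS) described above, here is a minimal Python sketch, assuming a Keras InceptionV3 classifier, Pillow for JPEG re-encoding, and an illustrative quality-factor grid (the exact QF set is elided in this record as "[Formula: see text]"). The function names, the preprocessing, and the size-based tie-breaking are illustrative assumptions, not details taken from the paper.

```python
import io
import numpy as np
from PIL import Image
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input

# Assumed QF grid; the abstract only says "10 JPEG compressed versions" and elides the exact values.
QFS = [100, 90, 80, 70, 60, 50, 40, 30, 20, 10]

model = InceptionV3(weights="imagenet")  # DNN pre-trained on pristine ImageNet images


def jpeg_bytes(img, qf):
    """Re-encode a PIL image as JPEG at quality factor qf and return the bytes."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=qf)
    return buf.getvalue()


def rank_of_label(probs, label):
    """Position (0 = top) of the ground-truth class in the descending score order."""
    order = np.argsort(-probs)
    return int(np.where(order == label)[0][0])


def highest_rank_select(path, label):
    """Pick, among the original file and its JPEG versions, the input whose
    prediction ranks the known ground-truth label highest."""
    original = Image.open(path)
    with open(path, "rb") as f:
        candidates = [f.read()]                              # the original version itself
    candidates += [jpeg_bytes(original, qf) for qf in QFS]   # its 10 compressed versions

    best_key, best_bytes = None, None
    for data in candidates:
        img = Image.open(io.BytesIO(data)).convert("RGB").resize((299, 299))
        x = preprocess_input(np.asarray(img, dtype=np.float32)[None, ...])
        probs = model.predict(x, verbose=0)[0]
        # Tie-breaking by fewer bits is an assumption added here, not a detail from the paper.
        key = (rank_of_label(probs, label), len(data))
        if best_key is None or key < best_key:
            best_key, best_bytes = key, data
    return best_bytes, best_key[0]
```

In the label-free setting the abstract turns to next, this selection step would be replaced by the proposed 11-input CNN topology or by the other selectors mentioned above, which do not use the ground-truth label.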
format Online
Article
Text
id pubmed-8304067
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8304067 2021-07-25 Compression Helps Deep Learning in Image Classification Yang, En-Hui Amer, Hossam Jiang, Yanbing Entropy (Basel) Article MDPI 2021-07-10 /pmc/articles/PMC8304067/ /pubmed/34356422 http://dx.doi.org/10.3390/e23070881 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Yang, En-Hui
Amer, Hossam
Jiang, Yanbing
Compression Helps Deep Learning in Image Classification
title Compression Helps Deep Learning in Image Classification
title_full Compression Helps Deep Learning in Image Classification
title_fullStr Compression Helps Deep Learning in Image Classification
title_full_unstemmed Compression Helps Deep Learning in Image Classification
title_short Compression Helps Deep Learning in Image Classification
title_sort compression helps deep learning in image classification
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8304067/
https://www.ncbi.nlm.nih.gov/pubmed/34356422
http://dx.doi.org/10.3390/e23070881
work_keys_str_mv AT yangenhui compressionhelpsdeeplearninginimageclassification
AT amerhossam compressionhelpsdeeplearninginimageclassification
AT jiangyanbing compressionhelpsdeeplearninginimageclassification