
Evaluating the Learning Procedure of CNNs through a Sequence of Prognostic Tests Utilising Information Theoretical Measures

Bibliographic Details
Main authors:	Shi, Xiyu; De-Silva, Varuna; Aslan, Yusuf; Ekmekcioglu, Erhan; Kondoz, Ahmet
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8774447/ https://www.ncbi.nlm.nih.gov/pubmed/35052093 http://dx.doi.org/10.3390/e24010067

collection	PubMed
description	Deep learning has proven to be an important element of modern data processing technology and has found application in many areas, such as multimodal sensor data processing and understanding, data generation and anomaly detection. While the use of deep learning is booming in many real-world tasks, the internal processes by which it arrives at its results are still uncertain. Understanding the data processing pathways within a deep neural network is important for transparency and better resource utilisation. In this paper, a method utilising information theoretic measures is used to reveal the typical learning patterns of convolutional neural networks, which are commonly used for image processing tasks. For this purpose, training samples, true labels and estimated labels are considered to be random variables. The mutual information and conditional entropy between these variables are then studied using information theoretic measures. This paper shows that adding convolutional layers to the network improves its learning, but that an unnecessarily high number of convolutional layers does not improve the learning any further. The number of convolutional layers that need to be added to a neural network to reach the desired learning level can be determined with the help of information theoretic quantities, including entropy, inequality and mutual information among the inputs to the network. The kernel size of convolutional layers only affects the learning speed of the network. This study also shows that the position of the dropout layer has no significant effect on the learning of networks with a lower dropout rate, whereas with higher dropout rates it is better placed immediately after the last convolutional layer.
id	pubmed-8774447
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
journal	Entropy (Basel)
published	MDPI 2021-12-30
license	© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
