Adaptive Modular Convolutional Neural Network for Image Recognition

Image recognition has long been a research hotspot in computer vision. Deep learning has developed rapidly in recent years, and convolutional neural networks usually need to be designed with fixed resources. If sufficient resources are available, the model can be scaled up to a...

Bibliographic Details
Main authors:	Wu, Wenbo; Pan, Yun
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Communication
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9330193/ https://www.ncbi.nlm.nih.gov/pubmed/35897991 http://dx.doi.org/10.3390/s22155488

_version_	1784758102994714624
author	Wu, Wenbo Pan, Yun
author_facet	Wu, Wenbo Pan, Yun
author_sort	Wu, Wenbo
collection	PubMed
description	Image recognition has long been a research hotspot in computer vision. Deep learning has developed rapidly in recent years, and convolutional neural networks usually need to be designed with fixed resources. If sufficient resources are available, the model can be scaled up to achieve higher accuracy, as in VggNet, ResNet, and GoogLeNet. Although large-scale models have improved accuracy, the following problems occur as the model scale grows: (1) the model may over-fit; (2) the number of model parameters increases; (3) the model converges slowly. This paper proposes a design method for a modular convolutional neural network model which addresses over-fitting and large parameter counts by connecting multiple modules in parallel. Moreover, each module contains several submodules (three submodules in this paper) and fuses the features extracted from the submodules. Model convergence can be accelerated by using the fused features, which contain more image information. In this study, we add a gate unit based on the attention mechanism to the model. The gate unit aims to optimize the structure of the model by selecting the optimal number of modules, allowing the model to learn an optimum network structure and dynamically reduce its FLOPs (floating-point operations). Compared to VggNet, ResNet, and GoogLeNet, the model proposed in this paper has a simple structure and few parameters. The proposed model achieves good results on the Kaggle datasets Cats-vs.-Dogs (99.3%), 10-Monkey Species (99.26%), and Birds-400 (99.13%).
format	Online Article Text
id	pubmed-9330193
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9330193 2022-07-29 Adaptive Modular Convolutional Neural Network for Image Recognition Wu, Wenbo Pan, Yun Sensors (Basel) Communication MDPI 2022-07-22 /pmc/articles/PMC9330193/ /pubmed/35897991 http://dx.doi.org/10.3390/s22155488 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Communication Wu, Wenbo Pan, Yun Adaptive Modular Convolutional Neural Network for Image Recognition
title	Adaptive Modular Convolutional Neural Network for Image Recognition
title_full	Adaptive Modular Convolutional Neural Network for Image Recognition
title_fullStr	Adaptive Modular Convolutional Neural Network for Image Recognition
title_full_unstemmed	Adaptive Modular Convolutional Neural Network for Image Recognition
title_short	Adaptive Modular Convolutional Neural Network for Image Recognition
title_sort	adaptive modular convolutional neural network for image recognition
topic	Communication
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9330193/ https://www.ncbi.nlm.nih.gov/pubmed/35897991 http://dx.doi.org/10.3390/s22155488
work_keys_str_mv	AT wuwenbo adaptivemodularconvolutionalneuralnetworkforimagerecognition AT panyun adaptivemodularconvolutionalneuralnetworkforimagerecognition
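
The description in this record outlines parallel modules, each fusing the features of three submodules, with an attention-based gate unit that selects how many modules to use. The record does not give the fusion rule, the gate formulation, or any threshold, so the following is only a minimal sketch under stated assumptions: fusion is taken to be concatenation, the gate is taken to be a softmax over per-module scores, and modules whose gate weight falls below a hypothetical `threshold` are skipped (which is how the sketch saves computation). All function names, the toy submodules, and the parameter values are illustrative and are not taken from the paper.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Submodule = Callable[[Sequence[float]], Vector]


def softmax(scores: Sequence[float]) -> Vector:
    """Numerically stable softmax; stands in for the attention-based gate."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def run_module(x: Sequence[float], submodules: Sequence[Submodule]) -> Vector:
    """Fuse submodule features by concatenation (an assumed fusion rule)."""
    fused: Vector = []
    for sub in submodules:
        fused.extend(sub(x))
    return fused


def gated_forward(
    x: Sequence[float],
    modules: Sequence[Sequence[Submodule]],
    gate_scores: Sequence[float],
    threshold: float = 0.1,  # hypothetical cut-off, not from the paper
) -> Tuple[Vector, List[int]]:
    """Run only the modules whose gate weight reaches the threshold."""
    weights = softmax(gate_scores)
    active = [i for i, w in enumerate(weights) if w >= threshold]
    if not active:
        # Always keep at least the highest-weighted module.
        active = [max(range(len(weights)), key=weights.__getitem__)]
    out: Vector = []
    for i in active:
        scaled = [weights[i] * f for f in run_module(x, modules[i])]
        out = scaled if not out else [a + b for a, b in zip(out, scaled)]
    return out, active


def make_module(scale: float) -> List[Submodule]:
    """Toy module with three submodules (mean, max, min), scaled by `scale`."""
    return [
        lambda x: [scale * sum(x) / len(x)],
        lambda x: [scale * max(x)],
        lambda x: [scale * min(x)],
    ]


modules = [make_module(1.0), make_module(2.0), make_module(3.0)]
# In a trained model the gate scores would be learned; here they are fixed.
output, used = gated_forward([1.0, 2.0, 3.0], modules, [2.0, 2.0, -5.0])
print("active modules:", used)
print("fused output length:", len(output))
```

With these fixed scores the third module receives a negligible gate weight and is never evaluated, which illustrates how a gate of this kind could reduce the operation count at inference time.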
