Evolutionary neural architecture search combining multi-branch ConvNet and improved transformer
Main Authors: | Xu, Yang; Ma, Yongjie |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10516961/ https://www.ncbi.nlm.nih.gov/pubmed/37737271 http://dx.doi.org/10.1038/s41598-023-42931-3 |
_version_ | 1785109234677972992 |
---|---|
author | Xu, Yang; Ma, Yongjie
author_facet | Xu, Yang; Ma, Yongjie
author_sort | Xu, Yang |
collection | PubMed |
description | Deep convolutional neural networks (CNNs) have achieved promising performance in the field of deep learning, but manual design has become very difficult due to the increasingly complex topologies of CNNs. Recently, neural architecture search (NAS) methods have been proposed to automatically design network architectures that are superior to handcrafted counterparts. Unfortunately, most current NAS methods suffer either from the high computational complexity of the generated architectures or from limited flexibility in architecture design. To address the above issues, this article proposes an evolutionary neural architecture search (ENAS) method based on an improved Transformer and a multi-branch ConvNet. The multi-branch block enriches the feature space and enhances the representational capacity of a network by combining paths of different complexities. Since convolution is inherently a local operation, a simple yet powerful “batch-free normalization Transformer Block” (BFNTBlock) is proposed to leverage both local information and long-range feature dependencies. In particular, mixing batch-free normalization (BFN) and batch normalization (BN) in the BFNTBlock blocks the accumulation of estimation shift caused by stacked BN layers, which benefits performance. The proposed method achieves remarkable accuracies of 97.24% and 80.06% on CIFAR10 and CIFAR100, respectively, with high computational efficiency, i.e., only 1.46 and 1.53 GPU days. To validate the universality of the method in application scenarios, the proposed algorithm is verified on two real-world applications, the GTSRB and NEU-CLS datasets, and achieves better performance than common methods. |
format | Online Article Text |
id | pubmed-10516961 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-10516961 2023-09-24 Evolutionary neural architecture search combining multi-branch ConvNet and improved transformer Xu, Yang; Ma, Yongjie Sci Rep Article Deep convolutional neural networks (CNNs) have achieved promising performance in the field of deep learning, but manual design has become very difficult due to the increasingly complex topologies of CNNs. Recently, neural architecture search (NAS) methods have been proposed to automatically design network architectures that are superior to handcrafted counterparts. Unfortunately, most current NAS methods suffer either from the high computational complexity of the generated architectures or from limited flexibility in architecture design. To address the above issues, this article proposes an evolutionary neural architecture search (ENAS) method based on an improved Transformer and a multi-branch ConvNet. The multi-branch block enriches the feature space and enhances the representational capacity of a network by combining paths of different complexities. Since convolution is inherently a local operation, a simple yet powerful “batch-free normalization Transformer Block” (BFNTBlock) is proposed to leverage both local information and long-range feature dependencies. In particular, mixing batch-free normalization (BFN) and batch normalization (BN) in the BFNTBlock blocks the accumulation of estimation shift caused by stacked BN layers, which benefits performance. The proposed method achieves remarkable accuracies of 97.24% and 80.06% on CIFAR10 and CIFAR100, respectively, with high computational efficiency, i.e., only 1.46 and 1.53 GPU days. To validate the universality of the method in application scenarios, the proposed algorithm is verified on two real-world applications, the GTSRB and NEU-CLS datasets, and achieves better performance than common methods. Nature Publishing Group UK 2023-09-22 /pmc/articles/PMC10516961/ /pubmed/37737271 http://dx.doi.org/10.1038/s41598-023-42931-3 Text en © The Author(s) 2023. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Xu, Yang Ma, Yongjie Evolutionary neural architecture search combining multi-branch ConvNet and improved transformer |
title | Evolutionary neural architecture search combining multi-branch ConvNet and improved transformer |
title_full | Evolutionary neural architecture search combining multi-branch ConvNet and improved transformer |
title_fullStr | Evolutionary neural architecture search combining multi-branch ConvNet and improved transformer |
title_full_unstemmed | Evolutionary neural architecture search combining multi-branch ConvNet and improved transformer |
title_short | Evolutionary neural architecture search combining multi-branch ConvNet and improved transformer |
title_sort | evolutionary neural architecture search combining multi-branch convnet and improved transformer |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10516961/ https://www.ncbi.nlm.nih.gov/pubmed/37737271 http://dx.doi.org/10.1038/s41598-023-42931-3 |
work_keys_str_mv | AT xuyang evolutionaryneuralarchitecturesearchcombiningmultibranchconvnetandimprovedtransformer AT mayongjie evolutionaryneuralarchitecturesearchcombiningmultibranchconvnetandimprovedtransformer |
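
Note: the abstract above only names the key components (a multi-branch ConvNet block and a “batch-free normalization Transformer Block” that mixes BFN with BN); it gives no implementation details. The following is a minimal, hypothetical PyTorch sketch of that idea, not the authors' BFNTBlock: the class name, the use of LayerNorm as a stand-in for batch-free normalization, and all dimensions are assumptions made purely for illustration.

```python
# Illustrative sketch only: NOT the authors' BFNTBlock implementation.
# Assumes PyTorch; LayerNorm stands in for the paper's "batch-free
# normalization" (BFN), i.e. a norm that needs no batch statistics.
import torch
import torch.nn as nn


class BFNTBlockSketch(nn.Module):
    """Hypothetical block mixing a batch-normalized convolution path
    (local features) with a batch-free-normalized self-attention path
    (long-range feature dependencies)."""

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Local path: 3x3 convolution with BatchNorm (uses batch statistics).
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Batch-free normalization stand-in: LayerNorm over the channel dim.
        self.bfn = nn.LayerNorm(channels)
        # Global path: multi-head self-attention over spatial positions.
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width)
        b, c, h, w = x.shape
        x = self.local(x)                       # convolution + BN (local features)
        tokens = x.flatten(2).transpose(1, 2)   # (batch, h*w, channels)
        tokens = self.bfn(tokens)               # batch-free norm before attention
        attn_out, _ = self.attn(tokens, tokens, tokens)
        tokens = tokens + attn_out              # residual connection
        return tokens.transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    block = BFNTBlockSketch(channels=32)
    out = block(torch.randn(2, 32, 16, 16))
    print(out.shape)  # torch.Size([2, 32, 16, 16])
```

The sketch only shows how a batch-normalized convolutional path can feed an attention path normalized without batch statistics; the paper's actual block structure, its BFN formulation, the multi-branch ConvNet, and the evolutionary search wrapped around them may differ substantially.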