S-Swin Transformer: simplified Swin Transformer model for offline handwritten Chinese character recognition

The Transformer shows good prospects in computer vision. However, the Swin Transformer model has the disadvantage of a large number of parameters and high computational effort. To effectively solve these problems of the model, a simplified Swin Transformer (S-Swin Transformer) model was proposed in...

Bibliographic Details
Main authors: Dan, Yongping; Zhu, Zongnan; Jin, Weishou; Li, Zhuo
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects: Artificial Intelligence
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9575930/
https://www.ncbi.nlm.nih.gov/pubmed/36262120
http://dx.doi.org/10.7717/peerj-cs.1093
author Dan, Yongping
Zhu, Zongnan
Jin, Weishou
Li, Zhuo
collection PubMed
description The Transformer shows good prospects in computer vision. However, the Swin Transformer model has the disadvantage of a large number of parameters and high computational effort. To effectively solve these problems of the model, a simplified Swin Transformer (S-Swin Transformer) model was proposed in this article for handwritten Chinese character recognition. The model reduces the original four hierarchical stages to three. In addition, the new model enlarges the window used in window attention, so each window contains more patches and covers a larger receptive field. As the network deepens, the patches become larger and the range perceived by each patch increases. Meanwhile, shifted-window attention enhances information interaction between neighboring windows. Experimental results show that the verification accuracy improves slightly as the window becomes larger. The best validation accuracy of the simplified Swin Transformer model on the dataset reached 95.70%. The model has only 8.69 million parameters and 2.90G FLOPs, a large reduction in parameter count and computation that supports the validity of the proposed model.
format Online
Article
Text
id pubmed-9575930
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-9575930 2022-10-18 PeerJ Comput Sci Artificial Intelligence PeerJ Inc. 2022-09-20 /pmc/articles/PMC9575930/ /pubmed/36262120 http://dx.doi.org/10.7717/peerj-cs.1093 Text en ©2022 Dan et al.
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
title S-Swin Transformer: simplified Swin Transformer model for offline handwritten Chinese character recognition
topic Artificial Intelligence