S-Swin Transformer: simplified Swin Transformer model for offline handwritten Chinese character recognition
The Transformer shows good prospects in computer vision. However, the Swin Transformer model has the disadvantage of a large number of parameters and high computational effort. To effectively solve these problems of the model, a simplified Swin Transformer (S-Swin Transformer) model was proposed in...
Main authors: | Dan, Yongping; Zhu, Zongnan; Jin, Weishou; Li, Zhuo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc. 2022 |
Subjects: | Artificial Intelligence |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9575930/ https://www.ncbi.nlm.nih.gov/pubmed/36262120 http://dx.doi.org/10.7717/peerj-cs.1093 |
_version_ | 1784811422021058560 |
---|---|
author | Dan, Yongping Zhu, Zongnan Jin, Weishou Li, Zhuo |
author_facet | Dan, Yongping Zhu, Zongnan Jin, Weishou Li, Zhuo |
author_sort | Dan, Yongping |
collection | PubMed |
description | The Transformer shows good prospects in computer vision. However, the Swin Transformer model has the disadvantage of a large number of parameters and high computational effort. To effectively solve these problems of the model, a simplified Swin Transformer (S-Swin Transformer) model was proposed in this article for handwritten Chinese character recognition. The model simplifies the initial four hierarchical stages into three hierarchical stages. In addition, the new model increases the size of the window in the window attention; the number of patches in the window is larger; and the perceptual field of the window is increased. As the network model deepens, the size of patches becomes larger, and the perceived range of each patch increases. Meanwhile, the purpose of shifting the window’s attention is to enhance the information interaction between the window and the window. Experimental results show that the verification accuracy improves slightly as the window becomes larger. The best validation accuracy of the simplified Swin Transformer model on the dataset reached 95.70%. The number of parameters is only 8.69 million, and FLOPs are 2.90G, which greatly reduces the number of parameters and computation of the model and proves the correctness and validity of the proposed model. |
format | Online Article Text |
id | pubmed-9575930 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-95759302022-10-18 S-Swin Transformer: simplified Swin Transformer model for offline handwritten Chinese character recognition Dan, Yongping Zhu, Zongnan Jin, Weishou Li, Zhuo PeerJ Comput Sci Artificial Intelligence The Transformer shows good prospects in computer vision. However, the Swin Transformer model has the disadvantage of a large number of parameters and high computational effort. To effectively solve these problems of the model, a simplified Swin Transformer (S-Swin Transformer) model was proposed in this article for handwritten Chinese character recognition. The model simplifies the initial four hierarchical stages into three hierarchical stages. In addition, the new model increases the size of the window in the window attention; the number of patches in the window is larger; and the perceptual field of the window is increased. As the network model deepens, the size of patches becomes larger, and the perceived range of each patch increases. Meanwhile, the purpose of shifting the window’s attention is to enhance the information interaction between the window and the window. Experimental results show that the verification accuracy improves slightly as the window becomes larger. The best validation accuracy of the simplified Swin Transformer model on the dataset reached 95.70%. The number of parameters is only 8.69 million, and FLOPs are 2.90G, which greatly reduces the number of parameters and computation of the model and proves the correctness and validity of the proposed model. PeerJ Inc. 2022-09-20 /pmc/articles/PMC9575930/ /pubmed/36262120 http://dx.doi.org/10.7717/peerj-cs.1093 Text en ©2022 Dan et al. 
https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Artificial Intelligence Dan, Yongping Zhu, Zongnan Jin, Weishou Li, Zhuo S-Swin Transformer: simplified Swin Transformer model for offline handwritten Chinese character recognition |
title | S-Swin Transformer: simplified Swin Transformer model for offline handwritten Chinese character recognition |
title_full | S-Swin Transformer: simplified Swin Transformer model for offline handwritten Chinese character recognition |
title_fullStr | S-Swin Transformer: simplified Swin Transformer model for offline handwritten Chinese character recognition |
title_full_unstemmed | S-Swin Transformer: simplified Swin Transformer model for offline handwritten Chinese character recognition |
title_short | S-Swin Transformer: simplified Swin Transformer model for offline handwritten Chinese character recognition |
title_sort | s-swin transformer: simplified swin transformer model for offline handwritten chinese character recognition |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9575930/ https://www.ncbi.nlm.nih.gov/pubmed/36262120 http://dx.doi.org/10.7717/peerj-cs.1093 |
work_keys_str_mv | AT danyongping sswintransformersimplifiedswintransformermodelforofflinehandwrittenchinesecharacterrecognition AT zhuzongnan sswintransformersimplifiedswintransformermodelforofflinehandwrittenchinesecharacterrecognition AT jinweishou sswintransformersimplifiedswintransformermodelforofflinehandwrittenchinesecharacterrecognition AT lizhuo sswintransformersimplifiedswintransformermodelforofflinehandwrittenchinesecharacterrecognition |
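
The description above mentions window attention over groups of patches and a shifted window that lets neighboring windows exchange information. The following is a minimal NumPy sketch of those two partitioning steps only, not the authors' implementation. The function names, the 8x8 patch grid, and the window size of 4 are hypothetical choices for illustration; the record does not state the sizes used in the article. The attention computation itself, and the masking that Swin-style models apply to wrapped-around windows, are omitted.

```python
import numpy as np


def window_partition(feature_map, window_size):
    """Split an (H, W, C) feature map into non-overlapping square windows.

    Returns an array of shape (num_windows, window_size, window_size, C).
    """
    h, w, c = feature_map.shape
    if h % window_size or w % window_size:
        raise ValueError("H and W must be divisible by window_size")
    x = feature_map.reshape(h // window_size, window_size,
                            w // window_size, window_size, c)
    # Move the two window-grid axes next to each other, then flatten them.
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(-1, window_size, window_size, c)


def shifted_window_partition(feature_map, window_size):
    """Cyclically shift the map by half a window, then partition it."""
    shift = window_size // 2
    shifted = np.roll(feature_map, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(shifted, window_size)


# Hypothetical 8x8 grid of patches with one channel; values are patch indices.
patches = np.arange(64).reshape(8, 8, 1)
regular = window_partition(patches, 4)
shifted = shifted_window_partition(patches, 4)

print(regular.shape)        # (number of windows, window height, window width, channels)
print(regular[0, :, :, 0])  # patches grouped in the first regular window
print(shifted[0, :, :, 0])  # first shifted window straddles four regular windows
```

With these hypothetical sizes, each window groups 16 patches, and the first shifted window covers rows and columns 2 to 5 of the grid, so it contains patches from all four regular windows. A larger `window_size` puts more patches in each window, which is the sense in which the description says the window's perceptual field increases.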