
An innovative network based on double receptive field and Recursive Bi-directional Long Short-Term Memory


Bibliographic Details
Main authors: Meng, Pengfei; Jia, Shuangcheng; Li, Qian
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8626447/
https://www.ncbi.nlm.nih.gov/pubmed/34836997
http://dx.doi.org/10.1038/s41598-021-01520-y
author Meng, Pengfei
Jia, Shuangcheng
Li, Qian
collection PubMed
description Sequence recognition in natural scene images has long been an important research topic in computer vision. CRNN has proven to be a popular end-to-end character sequence recognition network. However, CRNN does not account for wide characters, and it is less effective at recognizing long, dense sequences of small characters. To address these shortcomings, we propose an improved CRNN network, named CRNN-RES, based on BiLSTM and multiple receptive fields. First, CRNN-RES uses a dual pooling core to strengthen the CNN's ability to extract features. Second, the last RNN layer is changed from a BiLSTM to a shared-parameter BiLSTM network with recursive residuals, which reduces the number of network parameters and improves accuracy. In addition, we design a structure, called the CRFC layer, that can flexibly configure the length of the input data sequence in the RNN layer. Extensive experiments comparing CRNN-RES with the original CRNN show that, when recognizing English characters and digits, CRNN-RES has 8,197,549 parameters, 133,752 fewer than CRNN. On the public datasets ICDAR 2003 (IC03), ICDAR 2013 (IC13), IIIT 5k-word (IIIT5k), and Street View Text (SVT), CRNN-RES achieves accuracies of 96.90%, 89.85%, 83.63%, and 82.96%, which are higher than those of CRNN by 1.40%, 3.15%, 5.43%, and 2.16%, respectively.
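The abstract reports CRNN-RES figures together with their differences from CRNN, so the baseline CRNN values can be derived by simple arithmetic. The sketch below does that in Python. It assumes the accuracy gains are percentage points (the abstract writes them as "%"), and the function names are hypothetical, chosen only for this illustration. The derived baselines are implied by the abstract, not quoted from the paper's tables.

```python
from decimal import Decimal

# Figures quoted in the abstract.
CRNN_RES_PARAMS = 8_197_549
PARAM_REDUCTION = 133_752  # parameters fewer than CRNN

# dataset -> (CRNN-RES accuracy in %, reported gain over CRNN)
# Assumption: the gain is read as percentage points.
REPORTED = {
    "IC03": (Decimal("96.90"), Decimal("1.40")),
    "IC13": (Decimal("89.85"), Decimal("3.15")),
    "IIIT5k": (Decimal("83.63"), Decimal("5.43")),
    "SVT": (Decimal("82.96"), Decimal("2.16")),
}


def implied_crnn_params() -> int:
    """Parameter count of the CRNN baseline implied by the abstract."""
    return CRNN_RES_PARAMS + PARAM_REDUCTION


def relative_param_reduction_pct() -> float:
    """Parameter reduction as a percentage of the implied CRNN size."""
    return 100.0 * PARAM_REDUCTION / implied_crnn_params()


def implied_crnn_accuracy() -> dict:
    """CRNN accuracy per dataset, derived as CRNN-RES accuracy minus gain."""
    return {name: acc - gain for name, (acc, gain) in REPORTED.items()}


print("implied CRNN parameters:", implied_crnn_params())
print("relative reduction (%):", round(relative_param_reduction_pct(), 2))
for dataset, accuracy in implied_crnn_accuracy().items():
    print(dataset, "implied CRNN accuracy (%):", accuracy)
```

Decimal is used instead of float so that the two-decimal percentages subtract exactly. Under these assumptions the parameter saving is small relative to the model size, which suggests the accuracy gains matter more than the size reduction.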
format Online
Article
Text
id pubmed-8626447
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8626447 2021-11-29 An innovative network based on double receptive field and Recursive Bi-directional Long Short-Term Memory Meng, Pengfei Jia, Shuangcheng Li, Qian Sci Rep Article
Nature Publishing Group UK 2021-11-26 /pmc/articles/PMC8626447/ /pubmed/34836997 http://dx.doi.org/10.1038/s41598-021-01520-y Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title An innovative network based on double receptive field and Recursive Bi-directional Long Short-Term Memory
topic Article