A Smart Visual Sensing Concept Involving Deep Learning for a Robust Optical Character Recognition under Hard Real-World Conditions

Bibliographic Details
Main Authors: Mohsenzadegan, Kabeh, Tavakkoli, Vahid, Kyamakya, Kyandoghere
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9414947/
https://www.ncbi.nlm.nih.gov/pubmed/36015785
http://dx.doi.org/10.3390/s22166025
_version_ 1784776111621668864
author Mohsenzadegan, Kabeh
Tavakkoli, Vahid
Kyamakya, Kyandoghere
author_facet Mohsenzadegan, Kabeh
Tavakkoli, Vahid
Kyamakya, Kyandoghere
author_sort Mohsenzadegan, Kabeh
collection PubMed
description In this study, we propose a new model for optical character recognition (OCR) based on both CNNs (convolutional neural networks) and RNNs (recurrent neural networks). The distortions affecting a document image can take different forms, such as blur (focus blur, motion blur, etc.), shadow, and bad contrast. Such distortions significantly degrade the performance of OCR systems, in some cases to near zero, so a robust OCR model that performs well even under hard distortion conditions is still sorely needed. Our comprehensive study in this paper shows that various related works can somewhat improve their OCR recognition performance on degraded document images (e.g., captured by smartphone cameras under different conditions and thus distorted by shadows, poor contrast, blur, etc.), but it is worth underscoring that the improved recognition is neither sufficient nor always satisfactory, especially under very harsh conditions. Therefore, in this paper, we suggest and develop a fundamentally different approach and model architecture, which significantly outperforms the aforementioned related works. Furthermore, a new dataset was gathered to represent a series of different, well-representative real-world scenarios of hard distortion conditions. The suggested OCR model performs in such a way that even document images from the hardest conditions, previously not recognizable by other OCR systems, can be fully recognized with up to 97.5% accuracy/precision by our new deep-learning-based OCR model.
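The record does not give the paper's actual architecture; as a rough, purely illustrative sketch of the general CNN-plus-RNN (CRNN-style) OCR pipeline the abstract alludes to, the following numpy snippet runs a convolutional feature extractor over a text-line image, reads the feature-map columns as a left-to-right sequence, and feeds them through a simple recurrent layer to get per-timestep character scores. All layer sizes, weights, and function names here are assumptions, not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_valid(img, kernel):
    """Naive single-channel 2D convolution with 'valid' padding."""
    kh, kw = kernel.shape
    h = img.shape[0] - kh + 1
    w = img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def crnn_forward(img, conv_k, w_xh, w_hh, w_hy):
    # 1) CNN stage: extract a feature map, then ReLU.
    fmap = np.maximum(conv2d_valid(img, conv_k), 0.0)
    # 2) Treat each column of the feature map as one timestep
    #    (reading the text line left to right).
    seq = fmap.T                       # shape: (timesteps, feat_dim)
    # 3) RNN stage: simple Elman recurrence over the column sequence.
    h = np.zeros(w_hh.shape[0])
    logits = []
    for x in seq:
        h = np.tanh(x @ w_xh + h @ w_hh)
        logits.append(h @ w_hy)        # per-step character scores
    return np.stack(logits)            # shape: (timesteps, n_classes)

# Toy shapes: an 8x32 grayscale strip, 3x3 kernel, 16 hidden units, 10 classes.
img = rng.standard_normal((8, 32))
conv_k = rng.standard_normal((3, 3))
feat_dim = 8 - 3 + 1                   # feature-map height after the conv
w_xh = rng.standard_normal((feat_dim, 16)) * 0.1
w_hh = rng.standard_normal((16, 16)) * 0.1
w_hy = rng.standard_normal((16, 10)) * 0.1

scores = crnn_forward(img, conv_k, w_xh, w_hh, w_hy)
print(scores.shape)  # (30, 10): one 10-way score vector per horizontal step
```

In a real system the per-timestep scores would typically be decoded into a character string (e.g., with CTC-style decoding), and the network would be much deeper; this sketch only shows how the convolutional and recurrent stages connect.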
format Online
Article
Text
id pubmed-9414947
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-94149472022-08-27 A Smart Visual Sensing Concept Involving Deep Learning for a Robust Optical Character Recognition under Hard Real-World Conditions Mohsenzadegan, Kabeh Tavakkoli, Vahid Kyamakya, Kyandoghere Sensors (Basel) Article MDPI 2022-08-12 /pmc/articles/PMC9414947/ /pubmed/36015785 http://dx.doi.org/10.3390/s22166025 Text en © 2022 by the authors.
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Mohsenzadegan, Kabeh
Tavakkoli, Vahid
Kyamakya, Kyandoghere
A Smart Visual Sensing Concept Involving Deep Learning for a Robust Optical Character Recognition under Hard Real-World Conditions
title A Smart Visual Sensing Concept Involving Deep Learning for a Robust Optical Character Recognition under Hard Real-World Conditions
title_full A Smart Visual Sensing Concept Involving Deep Learning for a Robust Optical Character Recognition under Hard Real-World Conditions
title_fullStr A Smart Visual Sensing Concept Involving Deep Learning for a Robust Optical Character Recognition under Hard Real-World Conditions
title_full_unstemmed A Smart Visual Sensing Concept Involving Deep Learning for a Robust Optical Character Recognition under Hard Real-World Conditions
title_short A Smart Visual Sensing Concept Involving Deep Learning for a Robust Optical Character Recognition under Hard Real-World Conditions
title_sort smart visual sensing concept involving deep learning for a robust optical character recognition under hard real-world conditions
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9414947/
https://www.ncbi.nlm.nih.gov/pubmed/36015785
http://dx.doi.org/10.3390/s22166025
work_keys_str_mv AT mohsenzadegankabeh asmartvisualsensingconceptinvolvingdeeplearningforarobustopticalcharacterrecognitionunderhardrealworldconditions
AT tavakkolivahid asmartvisualsensingconceptinvolvingdeeplearningforarobustopticalcharacterrecognitionunderhardrealworldconditions
AT kyamakyakyandoghere asmartvisualsensingconceptinvolvingdeeplearningforarobustopticalcharacterrecognitionunderhardrealworldconditions
AT mohsenzadegankabeh smartvisualsensingconceptinvolvingdeeplearningforarobustopticalcharacterrecognitionunderhardrealworldconditions
AT tavakkolivahid smartvisualsensingconceptinvolvingdeeplearningforarobustopticalcharacterrecognitionunderhardrealworldconditions
AT kyamakyakyandoghere smartvisualsensingconceptinvolvingdeeplearningforarobustopticalcharacterrecognitionunderhardrealworldconditions