Improving Scene Text Recognition for Indian Languages with Transfer Learning and Font Diversity

Reading Indian scene texts is complex due to the use of regional vocabulary, multiple fonts/scripts, and text size. This work investigates the significant differences in Indian and Latin Scene Text Recognition (STR) systems. Recent STR works rely on synthetic generators that involve diverse fonts to...

Bibliographic Details
Main Authors: Gunna, Sanjana; Saluja, Rohit; Jawahar, Cheerakkuzhi Veluthemana
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9025185/
https://www.ncbi.nlm.nih.gov/pubmed/35448213
http://dx.doi.org/10.3390/jimaging8040086
_version_ 1784690806205972480
author Gunna, Sanjana
Saluja, Rohit
Jawahar, Cheerakkuzhi Veluthemana
collection PubMed
description Reading Indian scene text is difficult because of regional vocabulary, multiple fonts and scripts, and varying text size. This work investigates the significant differences between Indian and Latin Scene Text Recognition (STR) systems. Recent STR works rely on synthetic generators that use diverse fonts to ensure robust reading solutions. We propose adding non-Unicode fonts to the commonly used Unicode fonts to cover font diversity in such synthesizers for Indian languages. We also perform transfer learning experiments among six different Indian languages. Our transfer learning experiments on synthetic images with common backgrounds show that Indian scripts benefit more from each other than from the extensive English datasets. Our evaluations in real settings achieve significant improvements over previous methods on four Indian languages from standard datasets such as IIIT-ILST and MLT-17, and on the new dataset we release, which contains 440 scene images with 500 Gujarati and 2535 Tamil words. Further enriching the synthetic dataset with non-Unicode fonts and multiple augmentations yields a Word Recognition Rate gain of over [Formula: see text] on the IIIT-ILST Hindi dataset. We also present the results of lexicon-based transcription approaches for all six languages.
format Online
Article
Text
id pubmed-9025185
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9025185 2022-04-23 Improving Scene Text Recognition for Indian Languages with Transfer Learning and Font Diversity. Gunna, Sanjana; Saluja, Rohit; Jawahar, Cheerakkuzhi Veluthemana. J Imaging. Article. MDPI 2022-03-23. /pmc/articles/PMC9025185/ /pubmed/35448213 http://dx.doi.org/10.3390/jimaging8040086 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Improving Scene Text Recognition for Indian Languages with Transfer Learning and Font Diversity
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9025185/
https://www.ncbi.nlm.nih.gov/pubmed/35448213
http://dx.doi.org/10.3390/jimaging8040086