
SEDIQA: Sound Emitting Document Image Quality Assessment in a Reading Aid for the Visually Impaired

For visually impaired people (VIPs), the ability to convert text to sound can mean a new level of independence or the simple joy of a good book. With significant advances in optical character recognition (OCR) in recent years, a number of reading aids are appearing on the market. These reading aids...


Bibliographic Details
Main author:	Courtney, Jane
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8470036/ https://www.ncbi.nlm.nih.gov/pubmed/34460804 http://dx.doi.org/10.3390/jimaging7090168

author	Courtney, Jane
collection	PubMed
description	For visually impaired people (VIPs), the ability to convert text to sound can mean a new level of independence or the simple joy of a good book. With significant advances in optical character recognition (OCR) in recent years, a number of reading aids are appearing on the market. These reading aids convert images captured by a camera to text which can then be read aloud. However, all of these reading aids suffer from a key issue—the user must be able to visually target the text and capture an image of sufficient quality for the OCR algorithm to function—no small task for VIPs. In this work, a sound-emitting document image quality assessment metric (SEDIQA) is proposed which allows the user to hear the quality of the text image and automatically captures the best image for OCR accuracy. This work also includes testing of OCR performance against image degradations, to identify the most significant contributors to accuracy reduction. The proposed no-reference image quality assessor (NR-IQA) is validated alongside established NR-IQAs and this work includes insights into the performance of these NR-IQAs on document images. SEDIQA is found to consistently select the best image for OCR accuracy. The full system includes a document image enhancement technique which introduces improvements in OCR accuracy with an average increase of 22% and a maximum increase of 68%.
format	Online Article Text
id	pubmed-8470036
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-84700362021-10-28 SEDIQA: Sound Emitting Document Image Quality Assessment in a Reading Aid for the Visually Impaired Courtney, Jane J Imaging Article For visually impaired people (VIPs), the ability to convert text to sound can mean a new level of independence or the simple joy of a good book. With significant advances in optical character recognition (OCR) in recent years, a number of reading aids are appearing on the market. These reading aids convert images captured by a camera to text which can then be read aloud. However, all of these reading aids suffer from a key issue—the user must be able to visually target the text and capture an image of sufficient quality for the OCR algorithm to function—no small task for VIPs. In this work, a sound-emitting document image quality assessment metric (SEDIQA) is proposed which allows the user to hear the quality of the text image and automatically captures the best image for OCR accuracy. This work also includes testing of OCR performance against image degradations, to identify the most significant contributors to accuracy reduction. The proposed no-reference image quality assessor (NR-IQA) is validated alongside established NR-IQAs and this work includes insights into the performance of these NR-IQAs on document images. SEDIQA is found to consistently select the best image for OCR accuracy. The full system includes a document image enhancement technique which introduces improvements in OCR accuracy with an average increase of 22% and a maximum increase of 68%. MDPI 2021-08-30 /pmc/articles/PMC8470036/ /pubmed/34460804 http://dx.doi.org/10.3390/jimaging7090168 Text en © 2021 by the author. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	SEDIQA: Sound Emitting Document Image Quality Assessment in a Reading Aid for the Visually Impaired
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8470036/ https://www.ncbi.nlm.nih.gov/pubmed/34460804 http://dx.doi.org/10.3390/jimaging7090168

