Images with hidden information data set for information retrieval usage

The main task in Optical Character Recognition (OCR) is to extract all the text characters in an image and convert them to plain text. However, if the image has low contrast and low exposure, the characters may be hidden and cannot be recovered completely. One solution that has...

Bibliographic Details
Main authors: Pangestu, Peter; Gunawan, Dennis; Hansun, Seng
Format: Online Article Text
Language: English
Published: Elsevier 2019
Subjects: Computer Science
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6727008/
https://www.ncbi.nlm.nih.gov/pubmed/31508469
http://dx.doi.org/10.1016/j.dib.2019.104397
author Pangestu, Peter
Gunawan, Dennis
Hansun, Seng
collection PubMed
description The main task in Optical Character Recognition (OCR) is to extract all the text characters in an image and convert them to plain text. However, if the image has low contrast and low exposure, the characters may be hidden and cannot be recovered completely. One solution, reported in 2017, is to apply histogram equalization as a pre-processing step in OCR. Here, we provide a data set of 30 samples, some of which were used in the experiments reported in 2017 and others of which were added later.
format Online
Article
Text
id pubmed-6727008
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-6727008 2019-09-10 Images with hidden information data set for information retrieval usage Pangestu, Peter Gunawan, Dennis Hansun, Seng Data Brief Computer Science The main task in Optical Character Recognition (OCR) is to extract all the text characters in an image and convert them to plain text. However, if the image has low contrast and low exposure, the characters may be hidden and cannot be recovered completely. One solution, reported in 2017, is to apply histogram equalization as a pre-processing step in OCR. Here, we provide a data set of 30 samples, some of which were used in the experiments reported in 2017 and others of which were added later. Elsevier 2019-08-16 /pmc/articles/PMC6727008/ /pubmed/31508469 http://dx.doi.org/10.1016/j.dib.2019.104397 Text en © 2019 The Author(s) http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Images with hidden information data set for information retrieval usage
topic Computer Science