Images with hidden information data set for information retrieval usage
The main task in Optical Character Recognition (OCR) is to extract all the text characters in an image and convert them to plain text. However, when an image has low contrast and low exposure, the characters may be hidden and cannot be recovered completely. One solution that has...
Main authors: | Pangestu, Peter; Gunawan, Dennis; Hansun, Seng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2019 |
Subjects: | Computer Science |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6727008/ https://www.ncbi.nlm.nih.gov/pubmed/31508469 http://dx.doi.org/10.1016/j.dib.2019.104397 |
_version_ | 1783449183474155520 |
---|---|
author | Pangestu, Peter Gunawan, Dennis Hansun, Seng |
author_facet | Pangestu, Peter Gunawan, Dennis Hansun, Seng |
author_sort | Pangestu, Peter |
collection | PubMed |
description | The main task in Optical Character Recognition (OCR) is to extract all the text characters in an image and convert them to plain text. However, when an image has low contrast and low exposure, the characters may be hidden and cannot be recovered completely. One solution, reported in 2017, is to apply histogram equalization as a pre-processing step in OCR. Here, we provide a total of 30 samples, some of which were used in the experiment reported in 2017 and others that were added later. |
format | Online Article Text |
id | pubmed-6727008 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-67270082019-09-10 Images with hidden information data set for information retrieval usage Pangestu, Peter Gunawan, Dennis Hansun, Seng Data Brief Computer Science The main task in Optical Character Recognition (OCR) is to get and convert all the text characters on an image as a plain text data. However, if the image has low contrast and low exposure, an issue may occur. The characters may be hidden and can't be recovered completely. One solution that has been done and reported in 2017 is by applying histogram equalization as a pre-processing step in OCR. Here, we deliver a total of 30 sample data, some of which had been used on the research's experiment reported in 2017, and some others were added later. Elsevier 2019-08-16 /pmc/articles/PMC6727008/ /pubmed/31508469 http://dx.doi.org/10.1016/j.dib.2019.104397 Text en © 2019 The Author(s) http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Computer Science Pangestu, Peter Gunawan, Dennis Hansun, Seng Images with hidden information data set for information retrieval usage |
title | Images with hidden information data set for information retrieval usage |
title_full | Images with hidden information data set for information retrieval usage |
title_fullStr | Images with hidden information data set for information retrieval usage |
title_full_unstemmed | Images with hidden information data set for information retrieval usage |
title_short | Images with hidden information data set for information retrieval usage |
title_sort | images with hidden information data set for information retrieval usage |
topic | Computer Science |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6727008/ https://www.ncbi.nlm.nih.gov/pubmed/31508469 http://dx.doi.org/10.1016/j.dib.2019.104397 |
work_keys_str_mv | AT pangestupeter imageswithhiddeninformationdatasetforinformationretrievalusage AT gunawandennis imageswithhiddeninformationdatasetforinformationretrievalusage AT hansunseng imageswithhiddeninformationdatasetforinformationretrievalusage |
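
The description in this record mentions histogram equalization as a pre-processing step for OCR on low-contrast, low-exposure images. The following is a minimal sketch of plain global histogram equalization on an 8-bit grayscale image, included only to illustrate the general technique. It is not taken from the data article and may differ from the authors' actual procedure. The function name `equalize_histogram` and the list-of-rows image representation are assumptions made for this example.

```python
def equalize_histogram(pixels, levels=256):
    """Global histogram equalization (illustrative sketch, not the authors' code).

    `pixels` is a grayscale image given as a list of rows of ints in [0, levels).
    Returns a new image whose intensities are spread across the full range.
    """
    flat = [value for row in pixels for value in row]
    if not flat:
        return []

    # Count how often each intensity occurs.
    hist = [0] * levels
    for value in flat:
        hist[value] += 1

    # Build the cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(flat)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Only one intensity is present, so there is no contrast to stretch.
        return [list(row) for row in pixels]

    # Map each intensity so the darkest used level goes to 0
    # and the brightest used level goes to levels - 1.
    scale = (levels - 1) / (total - cdf_min)
    lookup = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lookup[value] for value in row] for row in pixels]


# A tiny, very dark image: all values sit in a narrow band near black.
dark = [[10, 11], [12, 13]]
print(equalize_histogram(dark))  # [[0, 85], [170, 255]]
```

In an OCR pipeline the equalized image, rather than the original, would be passed to the recognition engine. Whether this improves recognition on a given image depends on the image and the engine.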