
MiBio: A dataset for OCR post-processing evaluation

We introduce a dataset for evaluating OCR post-processing models. The dataset contains fully aligned OCR texts and the ground-truth recognition texts of an English biodiversity book. To better support benchmark evaluation, we extracted the following information in TSV files: 1) 2907 OCR-generated er...


Bibliographic Details
Main authors: Mei, Jie; Islam, Aminul; Moh’d, Abidalrahman; Wu, Yajing; Milios, Evangelos E.
Format: Online Article Text
Language: English
Published: Elsevier 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6197712/
https://www.ncbi.nlm.nih.gov/pubmed/30364639
http://dx.doi.org/10.1016/j.dib.2018.08.099