MiBio: A dataset for OCR post-processing evaluation
We introduce a dataset for evaluating OCR post-processing models. The dataset contains fully aligned OCR texts and the ground-truth recognition texts of an English biodiversity book. To better support benchmark evaluation, we extracted the following information into TSV files: 1) 2907 OCR-generated er...
Main authors: | , , , , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Elsevier
2018
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6197712/ https://www.ncbi.nlm.nih.gov/pubmed/30364639 http://dx.doi.org/10.1016/j.dib.2018.08.099 |