A vast dataset for Kurdish handwritten digits and isolated characters recognition

Bibliographic Details
Main Authors: Abdalla, Peshraw Ahmed, Qadir, Abdalbasit Mohammed, Shakor, Mohammed Y., Saeed, Ari M., Jabar, Abdalla Taha, Salam, Ali Abdalla, Amin, Hedi Hamid Hama
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10018436/
https://www.ncbi.nlm.nih.gov/pubmed/36936638
http://dx.doi.org/10.1016/j.dib.2023.109014
Description
Summary: This article presents two large datasets of handwritten Central Kurdish digits and isolated characters, named K-ZHMARA and K-PIT. K-ZHMARA contains 70,000 images of Kurdish digits, 7,000 per digit, collected on printed A4 sheets with a 10 × 10 grid. K-PIT contains 245,000 images covering all Kurdish characters, 7,000 per character, collected on printed A4 sheets with a 12 × 10 grid. Together, the two datasets comprise 315,000 images. Python was used to scan each sheet and to segment, crop, resize, binarize, and invert the images using edge detection and image processing techniques.
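
The preprocessing steps named in the summary (segment, crop, resize, binarize, invert) can be illustrated with a minimal sketch. This is not the authors' code. It assumes a grayscale page stored as a list of pixel rows, and it replaces the edge detection step with a plain equal division of the page into grid cells. The function names, the order of the steps, the threshold of 128, and the 28 × 28 output size are all hypothetical choices made for illustration. It uses only the standard library so that it runs anywhere; a real pipeline would more likely rely on an image processing library.

```python
# Illustrative sketch only: not the authors' pipeline.
# Assumptions: equal-size grid cells (no edge detection), threshold 128,
# 28x28 output, and pages given as lists of rows of 0..255 gray values.

def split_grid(page, rows, cols):
    """Segment a page into rows * cols equal cells, in row-major order."""
    cell_h = len(page) // rows
    cell_w = len(page[0]) // cols
    cells = []
    for r in range(rows):
        for c in range(cols):
            band = page[r * cell_h:(r + 1) * cell_h]
            cells.append([row[c * cell_w:(c + 1) * cell_w] for row in band])
    return cells

def binarize_and_invert(cell, threshold=128):
    """Map dark ink (below threshold) to 255 and light paper to 0."""
    return [[255 if px < threshold else 0 for px in row] for row in cell]

def crop_to_ink(cell):
    """Crop to the bounding box of non-zero pixels; leave blank cells as-is."""
    ys = [y for y, row in enumerate(cell) if any(row)]
    if not ys:
        return cell
    xs = [x for x in range(len(cell[0])) if any(row[x] for row in cell)]
    return [row[xs[0]:xs[-1] + 1] for row in cell[ys[0]:ys[-1] + 1]]

def resize_nearest(cell, size):
    """Resize to size x size with nearest-neighbour sampling."""
    h, w = len(cell), len(cell[0])
    return [[cell[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]

def process_page(page, rows, cols, size=28, threshold=128):
    """Run segment -> binarize/invert -> crop -> resize on every grid cell."""
    results = []
    for cell in split_grid(page, rows, cols):
        binary = binarize_and_invert(cell, threshold)
        results.append(resize_nearest(crop_to_ink(binary), size))
    return results
```

Under these assumptions, a K-ZHMARA sheet would be processed with `process_page(page, rows=10, cols=10)`. For K-PIT the summary gives a 12 × 10 grid without saying which dimension is rows, so the argument order there would need to be checked against the actual sheets.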