A vast dataset for Kurdish handwritten digits and isolated characters recognition

This article presents two large datasets of Central Kurdish handwritten digits and isolated characters, named K-ZHMARA and K-PIT. The first dataset, K-ZHMARA, contains 70,000 images of Kurdish digits, 7000 images for each digit, and a printed A4 paper with a grid of 10 × 10 is used f...

Bibliographic Details
Main authors: Abdalla, Peshraw Ahmed; Qadir, Abdalbasit Mohammed; Shakor, Mohammed Y.; Saeed, Ari M.; Jabar, Abdalla Taha; Salam, Ali Abdalla; Amin, Hedi Hamid Hama
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10018436/
https://www.ncbi.nlm.nih.gov/pubmed/36936638
http://dx.doi.org/10.1016/j.dib.2023.109014
_version_ 1784907808441892864
author Abdalla, Peshraw Ahmed
Qadir, Abdalbasit Mohammed
Shakor, Mohammed Y.
Saeed, Ari M.
Jabar, Abdalla Taha
Salam, Ali Abdalla
Amin, Hedi Hamid Hama
author_facet Abdalla, Peshraw Ahmed
Qadir, Abdalbasit Mohammed
Shakor, Mohammed Y.
Saeed, Ari M.
Jabar, Abdalla Taha
Salam, Ali Abdalla
Amin, Hedi Hamid Hama
author_sort Abdalla, Peshraw Ahmed
collection PubMed
description This article presents two large datasets of Central Kurdish handwritten digits and isolated characters, named K-ZHMARA and K-PIT. The first, K-ZHMARA, contains 70,000 images of Kurdish digits (7000 images per digit); the data were collected on printed A4 sheets with a 10 × 10 grid. The second, K-PIT, contains 245,000 images covering all Kurdish characters (7000 images per character); the data were collected on printed A4 sheets with a 12 × 10 grid. Together, the two datasets comprise 315,000 images. Python was used to scan each sheet and to segment, crop, resize, binarize, and invert the images using edge detection and image processing techniques.
format Online
Article
Text
id pubmed-10018436
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10018436 2023-03-17 A vast dataset for Kurdish handwritten digits and isolated characters recognition Abdalla, Peshraw Ahmed Qadir, Abdalbasit Mohammed Shakor, Mohammed Y. Saeed, Ari M. Jabar, Abdalla Taha Salam, Ali Abdalla Amin, Hedi Hamid Hama Data Brief Data Article This article presents two large datasets of Central Kurdish handwritten digits and isolated characters, named K-ZHMARA and K-PIT. The first, K-ZHMARA, contains 70,000 images of Kurdish digits (7000 images per digit); the data were collected on printed A4 sheets with a 10 × 10 grid. The second, K-PIT, contains 245,000 images covering all Kurdish characters (7000 images per character); the data were collected on printed A4 sheets with a 12 × 10 grid. Together, the two datasets comprise 315,000 images. Python was used to scan each sheet and to segment, crop, resize, binarize, and invert the images using edge detection and image processing techniques. Elsevier 2023-03-02 /pmc/articles/PMC10018436/ /pubmed/36936638 http://dx.doi.org/10.1016/j.dib.2023.109014 Text en © 2023 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Data Article
Abdalla, Peshraw Ahmed
Qadir, Abdalbasit Mohammed
Shakor, Mohammed Y.
Saeed, Ari M.
Jabar, Abdalla Taha
Salam, Ali Abdalla
Amin, Hedi Hamid Hama
A vast dataset for Kurdish handwritten digits and isolated characters recognition
title A vast dataset for Kurdish handwritten digits and isolated characters recognition
title_full A vast dataset for Kurdish handwritten digits and isolated characters recognition
title_fullStr A vast dataset for Kurdish handwritten digits and isolated characters recognition
title_full_unstemmed A vast dataset for Kurdish handwritten digits and isolated characters recognition
title_short A vast dataset for Kurdish handwritten digits and isolated characters recognition
title_sort vast dataset for kurdish handwritten digits and isolated characters recognition
topic Data Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10018436/
https://www.ncbi.nlm.nih.gov/pubmed/36936638
http://dx.doi.org/10.1016/j.dib.2023.109014
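
The description in this record lists the processing steps applied to each scanned sheet (segment, crop, resize, binarize, invert) and gives the dataset sizes. The following is a minimal sketch of such a pipeline, not the authors' code. It assumes a page is already available as a 2D list of grayscale values (0 = black, 255 = white), and it substitutes a uniform grid split for the edge-detection-based segmentation mentioned in the description. All function names, the 28-pixel output size, and the threshold of 128 are hypothetical choices made for illustration.

```python
# Illustrative sketch only. A uniform grid split stands in for the
# edge-detection-based segmentation mentioned in the record.
# Function names, size=28 and threshold=128 are assumptions.

IMAGES_PER_CLASS = 7000  # stated in the record for both datasets


def split_grid(page, rows, cols):
    """Cut a 2D grayscale page (list of pixel rows) into rows * cols equal cells."""
    cell_h = len(page) // rows
    cell_w = len(page[0]) // cols
    cells = []
    for r in range(rows):
        for c in range(cols):
            band = page[r * cell_h:(r + 1) * cell_h]
            cells.append([row[c * cell_w:(c + 1) * cell_w] for row in band])
    return cells


def resize_nearest(img, out_h, out_w):
    """Resize a 2D image with nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def binarize_and_invert(img, threshold=128):
    """Map dark ink (< threshold) to 255 and light paper to 0."""
    return [[255 if px < threshold else 0 for px in row] for row in img]


def preprocess_page(page, rows, cols, size=28, threshold=128):
    """Split a page into grid cells, then resize, binarize and invert each cell."""
    return [
        binarize_and_invert(resize_nearest(cell, size, size), threshold)
        for cell in split_grid(page, rows, cols)
    ]


# Class counts implied by the figures in the record.
digit_classes = 70_000 // IMAGES_PER_CLASS       # K-ZHMARA
character_classes = 245_000 // IMAGES_PER_CLASS  # K-PIT
total_images = (digit_classes + character_classes) * IMAGES_PER_CLASS
```

As a usage example, a 10 × 10 sheet could be processed with `preprocess_page(page, rows=10, cols=10)`, which returns 100 binarized, inverted cell images. With real scans, a library such as OpenCV would normally replace the pure-Python helpers; they are used here only to keep the sketch self-contained.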