
Pashtu Language Digits Dataset

Pashtu is a language spoken by 50 million people in the world [1]. It is the national language of Afghanistan and is also spoken in the two largest provinces of Pakistan. It is written in a complex way by calligraphers. Despite the enormous literature and research work in Optical Character Reco...


Bibliographic Details
Main authors: Khan, Rehan Ullah, Khan, Khalil
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9679712/
https://www.ncbi.nlm.nih.gov/pubmed/36425990
http://dx.doi.org/10.1016/j.dib.2022.108701
_version_ 1784834257148968960
author Khan, Rehan Ullah
Khan, Khalil
author_facet Khan, Rehan Ullah
Khan, Khalil
author_sort Khan, Rehan Ullah
collection PubMed
description Pashtu is a language spoken by 50 million people in the world [1]. It is the national language of Afghanistan and is also spoken in the two largest provinces of Pakistan. It is written in a complex way by calligraphers. Despite the enormous literature and research work in Optical Character Recognition for other languages of the world, this language still requires a mature optical character recognition system [2], [3]. A real dataset of Pashtu digits comprising 50000 scanned images is introduced and made publicly available in this paper. All the digits in the images are handwritten and were collected from faculty members, staff, and students of the Pak-Austria Fachhochschule, Institute of Applied Sciences and Technology, Pakistan. A total of 1250 participants took part in writing the text, half of them male and half female. The dataset will be publicly available for research purposes.
format Online
Article
Text
id pubmed-9679712
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9679712 2022-11-23 Pashtu Language Digits Dataset Khan, Rehan Ullah Khan, Khalil Data Brief Data Article Pashtu is a language spoken by 50 million people in the world [1]. It is the national language of Afghanistan and is also spoken in the two largest provinces of Pakistan. It is written in a complex way by calligraphers. Despite the enormous literature and research work in Optical Character Recognition for other languages of the world, this language still requires a mature optical character recognition system [2], [3]. A real dataset of Pashtu digits comprising 50000 scanned images is introduced and made publicly available in this paper. All the digits in the images are handwritten and were collected from faculty members, staff, and students of the Pak-Austria Fachhochschule, Institute of Applied Sciences and Technology, Pakistan. A total of 1250 participants took part in writing the text, half of them male and half female. The dataset will be publicly available for research purposes. Elsevier 2022-10-26 /pmc/articles/PMC9679712/ /pubmed/36425990 http://dx.doi.org/10.1016/j.dib.2022.108701 Text en © 2022 The Author(s) https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Data Article
Khan, Rehan Ullah
Khan, Khalil
Pashtu Language Digits Dataset
title Pashtu Language Digits Dataset
title_full Pashtu Language Digits Dataset
title_fullStr Pashtu Language Digits Dataset
title_full_unstemmed Pashtu Language Digits Dataset
title_short Pashtu Language Digits Dataset
title_sort pashtu language digits dataset
topic Data Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9679712/
https://www.ncbi.nlm.nih.gov/pubmed/36425990
http://dx.doi.org/10.1016/j.dib.2022.108701
work_keys_str_mv AT khanrehanullah pashtulanguagedigitsdataset
AT khankhalil pashtulanguagedigitsdataset