
A multi-purpose dataset of Devanagari script comprising of isolated numerals and vowels

This article presents handwritten isolated characters of the Devanagari script. Devanagari script contains ten numerals, 13 vowels, and 33 consonants. Devanagari Character dataset includes 23 different characters of numerals and vowels. 2400 handwritten samples are collected for each of the numerals...


Bibliographic Details
Main Authors: Prashanth, Duddela Sai, Mehta, R Vasanth Kumar, Challa, Nagendra Panini
Format: Online Article Text
Language: English
Published: Elsevier 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8713117/
https://www.ncbi.nlm.nih.gov/pubmed/34993287
http://dx.doi.org/10.1016/j.dib.2021.107723
author Prashanth, Duddela Sai
Mehta, R Vasanth Kumar
Challa, Nagendra Panini
collection PubMed
description This article presents a dataset of handwritten isolated characters of the Devanagari script. The Devanagari script contains 10 numerals, 13 vowels, and 33 consonants; the Devanagari Character dataset covers the 23 numeral and vowel characters. For each numeral, 2,400 handwritten samples were collected, and 1,400 for each vowel. The collected samples were digitized and pre-processed, and noisy images were removed during pre-processing. The final dataset contains 38,750 images: 2,250 samples per numeral and 1,250 per vowel. The data are available as images and as comma-separated values (CSV), with labels attached. The dataset could be used for Optical Character Recognition research and deep learning. In India, Devanagari is the base script for 120+ languages; hence this dataset can serve as a base for Machine Learning research in these languages. The dataset is publicly available at https://data.mendeley.com/datasets/pxrnvp4yy8/2.
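The counts quoted in the description can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python; the constant and function names are illustrative and do not come from the dataset or the article, and the only inputs are the per-class figures stated above.

```python
# Class counts and per-class sample counts as stated in the description.
NUM_NUMERALS = 10
NUM_VOWELS = 13
COLLECTED = {"numeral": 2400, "vowel": 1400}  # samples collected per class
RETAINED = {"numeral": 2250, "vowel": 1250}   # samples kept per class after noise removal


def total_images(per_class):
    """Total image count for the given per-class sample counts."""
    return NUM_NUMERALS * per_class["numeral"] + NUM_VOWELS * per_class["vowel"]


collected_total = total_images(COLLECTED)
retained_total = total_images(RETAINED)
removed_total = collected_total - retained_total

print(f"collected: {collected_total}, retained: {retained_total}, removed: {removed_total}")
```

The retained total reproduces the 38,750 images reported for the final dataset. The collected and removed totals are derived from the per-class figures and are not stated in the record itself.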
format Online
Article
Text
id pubmed-8713117
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-8713117 2022-01-05 Data Brief Data Article Elsevier 2021-12-16 /pmc/articles/PMC8713117/ /pubmed/34993287 http://dx.doi.org/10.1016/j.dib.2021.107723 Text en © 2021 Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title A multi-purpose dataset of Devanagari script comprising of isolated numerals and vowels
topic Data Article