A multi-purpose dataset of Devanagari script comprising of isolated numerals and vowels
This article presents handwritten isolated characters of the Devanagari script. Devanagari script contains ten numerals, 13 vowels, and 33 consonants. Devanagari Character dataset includes 23 different characters of numerals and vowels. 2400 handwritten samples are collected for each of the numerals...
Main authors: | Prashanth, Duddela Sai; Mehta, R Vasanth Kumar; Challa, Nagendra Panini |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2021 |
Subjects: | Data Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8713117/ https://www.ncbi.nlm.nih.gov/pubmed/34993287 http://dx.doi.org/10.1016/j.dib.2021.107723 |
_version_ | 1784623706332463104 |
---|---|
author | Prashanth, Duddela Sai Mehta, R Vasanth Kumar Challa, Nagendra Panini |
author_facet | Prashanth, Duddela Sai Mehta, R Vasanth Kumar Challa, Nagendra Panini |
author_sort | Prashanth, Duddela Sai |
collection | PubMed |
description | This article presents a dataset of handwritten isolated characters of the Devanagari script. The script contains 10 numerals, 13 vowels, and 33 consonants; the dataset covers the 23 numeral and vowel characters. For each numeral, 2,400 handwritten samples were collected, and 1,400 for each vowel. The collected samples were digitized and pre-processed, and noisy images were removed during pre-processing. The final dataset contains 38,750 images: 2,250 samples per numeral and 1,250 per vowel. The data are available as images and as comma-separated values (CSV), with labels attached. The dataset can be used for optical character recognition (OCR) and deep learning research. In India, Devanagari is the base script from which more than 120 languages have evolved; hence this dataset can serve as a base for machine learning research in these languages. The dataset is publicly available at https://data.mendeley.com/datasets/pxrnvp4yy8/2. |
format | Online Article Text |
id | pubmed-8713117 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-8713117 2022-01-05 A multi-purpose dataset of Devanagari script comprising of isolated numerals and vowels Prashanth, Duddela Sai Mehta, R Vasanth Kumar Challa, Nagendra Panini Data Brief Data Article This article presents handwritten isolated characters of the Devanagari script. Devanagari script contains ten numerals, 13 vowels, and 33 consonants. Devanagari Character dataset includes 23 different characters of numerals and vowels. 2400 handwritten samples are collected for each of the numerals and 1400 for each vowel. Collected samples are digitized and pre-processed. During pre-processing, images with noise are removed. In this context, a final dataset of 38,750 images were included, where 2,250 and 1,250 samples for each numeral and vowel, respectively. The data is available in images and comma-separated-values, along with attached labels. The dataset could be used for Optical Character Recognition research and deep learning. In India, the Devanagari script is the base script on which 120+ languages are evolved; hence this dataset serves as the base for Machine Learning research in these languages. The data set is publicly available at https://data.mendeley.com/datasets/pxrnvp4yy8/2. Elsevier 2021-12-16 /pmc/articles/PMC8713117/ /pubmed/34993287 http://dx.doi.org/10.1016/j.dib.2021.107723 Text en © 2021 Published by Elsevier Inc. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). |
title | A multi-purpose dataset of Devanagari script comprising of isolated numerals and vowels |
topic | Data Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8713117/ https://www.ncbi.nlm.nih.gov/pubmed/34993287 http://dx.doi.org/10.1016/j.dib.2021.107723 |
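
The image counts given in the description can be checked with a little arithmetic. The sketch below is illustrative only: the function name `total_images` is hypothetical, and the collected total and the number of removed images are not stated in the record. They are derived here on the assumption that every numeral and every vowel class had exactly the stated per-class sample counts.

```python
# Class counts as stated in the record's description.
NUMERAL_CLASSES = 10
VOWEL_CLASSES = 13


def total_images(per_numeral: int, per_vowel: int) -> int:
    """Total images for the 23 classes, given per-class sample counts.

    Hypothetical helper for illustration; it is not part of the dataset.
    """
    return NUMERAL_CLASSES * per_numeral + VOWEL_CLASSES * per_vowel


# Samples collected per class before pre-processing (from the description).
collected = total_images(per_numeral=2400, per_vowel=1400)

# Samples retained per class after noisy images were removed.
retained = total_images(per_numeral=2250, per_vowel=1250)

# Derived value (an assumption-based estimate, not stated in the record).
removed = collected - retained

print(f"collected={collected} retained={retained} removed={removed}")
```

The retained total matches the 38,750 images reported in the description (10 × 2,250 + 13 × 1,250).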