A multi-purpose dataset of Devanagari script comprising of isolated numerals and vowels.

Abstract

This article presents a dataset of handwritten isolated characters of the Devanagari script. The script contains 10 numerals, 13 vowels, and 33 consonants; the Devanagari Character dataset covers the 23 numeral and vowel characters. For each numeral, 2,400 handwritten samples were collected, and 1,400 for each vowel. The collected samples were digitized and pre-processed, and noisy images were removed during pre-processing. The final dataset contains 38,750 images, with 2,250 samples per numeral and 1,250 per vowel. The data are available as images and as comma-separated values (CSV), with labels attached. The dataset can be used for optical character recognition and deep learning research. In India, Devanagari is the base script on which more than 120 languages have evolved, so this dataset can serve as a base for machine learning research in these languages. The dataset is publicly available at https://data.mendeley.com/datasets/pxrnvp4yy8/2. © 2021 Published by Elsevier Inc.
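The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the variable and function names are illustrative, and the "removed" figure is derived here on the assumption that every discarded sample was dropped during noise filtering. It is not a number reported by the authors.

```python
# Class counts stated in the abstract.
NUM_NUMERALS = 10
NUM_VOWELS = 13

# Per-class sample counts stated in the abstract.
COLLECTED = {"numeral": 2400, "vowel": 1400}   # before pre-processing
RETAINED = {"numeral": 2250, "vowel": 1250}    # after noisy images were removed


def total_images(per_class):
    """Total images across all 23 classes for the given per-class counts."""
    return NUM_NUMERALS * per_class["numeral"] + NUM_VOWELS * per_class["vowel"]


collected_total = total_images(COLLECTED)
retained_total = total_images(RETAINED)

# Derived value (assumption: the whole difference is due to noise removal).
removed_total = collected_total - retained_total

print(f"collected: {collected_total}")
print(f"retained:  {retained_total}")
print(f"removed:   {removed_total}")
```

The retained total matches the 38,750 images reported for the final dataset (22,500 numeral images plus 16,250 vowel images).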
