ET-Network: A novel efficient transformer deep learning model for automated Urdu handwritten text recognition.

May 17, 2024 Other

Researchers

Journal

PloS one

Modalities

Models

EfficientNet Transformer

Abstract

Automatic Urdu handwritten text recognition is a challenging task in the OCR industry. Unlike printed text, Urdu handwriting lacks a uniform font and structure, which causes data inconsistencies and recognition issues. Varied writing styles, cursive script, and limited data further complicate Urdu text recognition. Major languages such as English have seen advances in automated recognition, whereas low-resource languages such as Urdu still lag behind. Transformer-based models are promising for automated recognition in both high-resource languages and low-resource languages such as Urdu. This paper presents a transformer-based method called ET-Network that integrates self-attention into EfficientNet for feature extraction and uses a transformer for language modeling. The self-attention layers in EfficientNet help to extract global and local features that capture long-range dependencies. These features are then passed to a vanilla transformer to generate text, and a prefix beam search is used to select the best output. Three datasets, NUST-UHWR, UPTI2.0, and MMU-OCR-21, are used to train and test the ET-Network on handwritten Urdu script. The ET-Network improved the character error rate by 4% and the word error rate by 1.55%, establishing a new state-of-the-art character error rate of 5.27% and word error rate of 19.09% for Urdu handwritten text.

Copyright: © 2024 Hamza et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
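The abstract reports results as character error rate (CER) and word error rate (WER). As a minimal sketch of how these metrics are commonly defined (edit distance divided by reference length), the following may help in reading the numbers. The function names are hypothetical, and the sketch assumes characters are Unicode code points and words are whitespace-separated; the paper's own tokenization and normalization for Urdu may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: character edits / reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word error rate: word edits / reference word count (assumed definition)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)
```

Under these definitions, a CER of 5.27% would correspond to roughly five character edits per hundred reference characters. WER is typically higher than CER because a single wrong character makes the whole word count as an error.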
