Gaze Estimation Based on Convolutional Structure and Sliding Window-Based Attention Mechanism.

July 14, 2023 Ophthalmology

Abstract

The direction of human gaze is an important indicator of human behavior, reflecting the level of attention and cognitive state towards various visual stimuli in the environment. Convolutional neural networks have achieved good performance in gaze estimation tasks, but their global modeling capability is limited, making it difficult to further improve prediction performance. In recent years, transformer models have been introduced for gaze estimation and have achieved state-of-the-art performance. However, their slicing-and-mapping mechanism for processing local image patches can compromise local spatial information. Moreover, the single down-sampling rate and fixed-size tokens are not suitable for multiscale feature learning in gaze estimation tasks. To overcome these limitations, this study introduces a Swin Transformer for gaze estimation and designs two network architectures: a pure Swin Transformer gaze estimation model (SwinT-GE) and a hybrid gaze estimation model that combines convolutional structures with SwinT-GE (Res-Swin-GE). SwinT-GE uses the tiny version of the Swin Transformer for gaze estimation. Res-Swin-GE replaces the slicing-and-mapping mechanism of SwinT-GE with convolutional structures. Experimental results demonstrate that Res-Swin-GE significantly outperforms SwinT-GE, exhibiting strong competitiveness on the MpiiFaceGaze dataset and achieving a 7.5% performance improvement over existing state-of-the-art methods on the Eyediap dataset.
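The multiscale argument in the abstract can be made concrete with a little shape arithmetic. The sketch below is illustrative only and is not the authors' code. It assumes a 224×224 input image and the commonly published Swin-Tiny settings (4×4 patches, 96 base channels, 7×7 attention windows, four stages); the abstract does not state which exact settings SwinT-GE or Res-Swin-GE use. The function names are hypothetical. It contrasts a plain vision transformer, which keeps one fixed token grid, with a Swin-style hierarchy, where each stage halves the grid and doubles the channel width.

```python
def vit_token_grid(image_size, patch_size=16):
    """Plain ViT-style tokenization: a single down-sampling step,
    so every layer sees the same fixed token grid."""
    side = image_size // patch_size
    return side, side


def swin_stage_shapes(image_size, patch_size=4, embed_dim=96, num_stages=4):
    """Swin-style hierarchy (assumed Swin-Tiny defaults): each later stage
    halves the spatial grid and doubles the channel count."""
    shapes = []
    side, channels = image_size // patch_size, embed_dim
    for _ in range(num_stages):
        shapes.append((side, side, channels))
        side, channels = side // 2, channels * 2
    return shapes


def windows_per_stage(shapes, window_size=7):
    """Number of non-overlapping attention windows at each stage."""
    return [(h // window_size) * (w // window_size) for h, w, _ in shapes]


if __name__ == "__main__":
    image_size = 224  # assumed input resolution, not taken from the abstract
    print("ViT token grid:", vit_token_grid(image_size))
    shapes = swin_stage_shapes(image_size)
    for stage, shape in enumerate(shapes, start=1):
        print(f"Swin stage {stage}: height, width, channels = {shape}")
    print("Windows per stage:", windows_per_stage(shapes))
```

Under these assumptions the hierarchy yields feature maps at four scales (56, 28, 14, and 7 tokens per side), whereas the single-rate tokenization stays at 14×14 throughout. Res-Swin-GE, as described above, changes only how the first token grid is produced, substituting convolutional structures for the slicing-and-mapping step.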
