
Res2Net-based multi-scale and multi-attention model for traffic scene image classification.

Abstract

With traffic scene image classification finding ever wider use in intelligent transportation systems, there is growing demand for higher accuracy and robustness in this task. However, owing to variations in weather, time of day, and lighting, as well as annotation costs, traditional deep learning methods remain limited in extracting complex traffic scene features and achieving high recognition accuracy. Previous classification methods for traffic scene images also leave gaps in multi-scale feature extraction and in combining frequency-domain, spatial, and channel attention. To address these issues, this paper proposes a multi-scale and multi-attention model based on Res2Net. The proposed framework introduces an Adaptive Feature Refinement Pyramid Module (AFRPM) to enhance multi-scale feature extraction and thereby improve the accuracy of traffic scene image classification. Additionally, we integrate frequency-domain and spatial-channel attention mechanisms to improve recognition of complex backgrounds, objects at different scales, and local details in traffic scene images. We evaluate the model on traffic scene image classification using the Traffic-Net dataset. The experimental results show that our model achieves an accuracy of 96.88% on this dataset, an improvement of approximately 2% over the baseline Res2Net network. Furthermore, we validate the effectiveness of the proposed modules through ablation experiments.

Copyright: © 2024 Gao et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
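
To make the high-level description in the abstract concrete, below is a minimal PyTorch sketch of the overall pipeline: a convolutional backbone, a multi-scale pyramid fusion step, and channel-plus-spatial attention before the classification head. This is an illustration only, not the authors' implementation: the paper's AFRPM and frequency-domain attention are not detailed in the abstract, so generic pyramid pooling and CBAM-style attention stand in for them, a torchvision ResNet-50 stands in for the Res2Net backbone, and the class names (MultiScalePyramid, ChannelSpatialAttention, TrafficSceneClassifier) are hypothetical. The num_classes=4 setting assumes Traffic-Net's four categories.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50


class ChannelSpatialAttention(nn.Module):
    """Channel attention followed by spatial attention (generic CBAM-style stand-in)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: squeeze spatially, re-weight each channel.
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: 7x7 conv over channel-pooled maps.
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.channel_mlp(F.adaptive_avg_pool2d(x, 1).flatten(1))
        mx = self.channel_mlp(F.adaptive_max_pool2d(x, 1).flatten(1))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)

        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.amax(dim=1, keepdim=True)
        attn = torch.sigmoid(self.spatial_conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn


class MultiScalePyramid(nn.Module):
    """Pools features at several scales and fuses them (placeholder for the paper's AFRPM)."""

    def __init__(self, channels: int, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.fuse = nn.Conv2d(channels * (len(scales) + 1), channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        branches = [x]
        for s in self.scales:
            pooled = F.adaptive_avg_pool2d(x, (s, s))
            branches.append(F.interpolate(pooled, size=(h, w), mode="bilinear", align_corners=False))
        return self.fuse(torch.cat(branches, dim=1))


class TrafficSceneClassifier(nn.Module):
    """Backbone + multi-scale fusion + attention + linear classification head."""

    def __init__(self, num_classes: int = 4):
        super().__init__()
        backbone = resnet50(weights=None)  # stand-in; swap in a Res2Net backbone if available
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # keep conv feature maps
        self.pyramid = MultiScalePyramid(2048)
        self.attention = ChannelSpatialAttention(2048)
        self.head = nn.Linear(2048, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.features(x)       # (B, 2048, H/32, W/32)
        feats = self.pyramid(feats)    # multi-scale fusion
        feats = self.attention(feats)  # channel + spatial re-weighting
        pooled = F.adaptive_avg_pool2d(feats, 1).flatten(1)
        return self.head(pooled)


if __name__ == "__main__":
    model = TrafficSceneClassifier(num_classes=4)
    logits = model(torch.randn(2, 3, 224, 224))
    print(logits.shape)  # torch.Size([2, 4])
```

Replacing the stand-in backbone with a Res2Net model, and the pyramid and attention blocks with the paper's AFRPM and frequency-domain attention, would recover the architecture the abstract describes.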
