
Integrating Spatial and Temporal Information for Violent Activity Detection from Video Using Deep Spiking Neural Networks.

Abstract

Increasing violence in workplaces such as hospitals poses a serious challenge to public safety. However, visually monitoring large volumes of video data in real time is time-consuming and labor-intensive, so automatic and timely detection of violent activity in video is vital, especially for small monitoring systems. This paper proposes a two-stream deep learning architecture for video violent activity detection, named SpikeConvFlowNet. First, RGB frames and their optical flow data are used as the inputs to the two streams to extract the spatiotemporal features of a video. The spatiotemporal features from the two streams are then concatenated and fed to a classifier for the final decision. Each stream is a supervised neural network consisting of multiple convolutional spiking and pooling layers. The convolutional layers extract high-quality spatial features within frames, while the spiking neurons efficiently extract temporal features across frames by retaining historical information. Feeding optical flow through spiking neurons further strengthens the network's ability to capture critical motion information. Combining these advantages enhances both the performance and the efficiency of recognizing violent actions. Experimental results on public datasets demonstrate that, compared with the latest methods, this approach greatly reduces the parameter count and achieves higher inference efficiency with limited accuracy loss. It is a potential solution for embedded devices that provide low computing power but require fast processing speeds.
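The abstract does not specify the exact layer configuration, so the following is only a minimal PyTorch sketch of the two-stream idea it describes: each stream interleaves convolutions with leaky integrate-and-fire (LIF) neurons whose membrane state persists across frames, and the per-stream firing-rate features are concatenated for classification. The class names (`LIF`, `SpikingConvStream`), layer counts, channel widths, clip length, and LIF constants are all illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LIF(nn.Module):
    """Leaky integrate-and-fire neuron. The persistent membrane potential is
    what carries information across frames (the 'historical information' the
    abstract refers to). Note the hard threshold blocks gradients; real SNN
    training typically substitutes a surrogate gradient, omitted here."""
    def __init__(self, decay=0.9, threshold=1.0):
        super().__init__()
        self.decay, self.threshold = decay, threshold

    def forward(self, x, mem):
        mem = self.decay * mem + x               # leak, then integrate input
        spike = (mem >= self.threshold).float()  # binary spike on crossing
        mem = mem - spike * self.threshold       # soft reset of fired units
        return spike, mem

class SpikingConvStream(nn.Module):
    """One stream: convolutional spiking layers with pooling, stepped over
    T frames. Depth and widths are placeholders."""
    def __init__(self, in_ch):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, 16, 3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        self.lif1, self.lif2 = LIF(), LIF()

    def forward(self, clip):                     # clip: (B, T, C, H, W)
        T = clip.shape[1]
        mem1 = mem2 = 0.0
        rate = 0.0
        for t in range(T):                       # step through the frames
            s1, mem1 = self.lif1(self.conv1(clip[:, t]), mem1)
            s1 = F.max_pool2d(s1, 2)
            s2, mem2 = self.lif2(self.conv2(s1), mem2)
            s2 = F.max_pool2d(s2, 2)
            rate = rate + s2.mean(dim=(2, 3))    # accumulate firing rates
        return rate / T                          # (B, 32) per-stream feature

class SpikeConvFlowNet(nn.Module):
    """Two streams (RGB frames, optical flow) whose features are
    concatenated and fed to a classifier, as in the abstract."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.rgb_stream = SpikingConvStream(in_ch=3)   # RGB frames
        self.flow_stream = SpikingConvStream(in_ch=2)  # (dx, dy) flow fields
        self.classifier = nn.Linear(32 + 32, num_classes)

    def forward(self, rgb, flow):
        feat = torch.cat([self.rgb_stream(rgb), self.flow_stream(flow)], dim=1)
        return self.classifier(feat)

# Smoke test on random data: 4 clips of 16 frames at 112x112.
net = SpikeConvFlowNet()
rgb = torch.rand(4, 16, 3, 112, 112)
flow = torch.rand(4, 16, 2, 112, 112)
print(net(rgb, flow).shape)  # torch.Size([4, 2])
```

Averaging spike counts into firing rates is one common way to read features out of a spiking layer; the paper may well decode differently, but the sketch illustrates why the parameter count stays small: the temporal memory lives in the membrane state rather than in extra recurrent weights.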
