Spatio-Temporal Scale Coded Bag-of-Words

Abstract

The Bag-of-Words (BoW) framework has been widely used in action recognition because of its compact and efficient feature representation. Various modifications have been made to the framework to increase its classification power, but these often increase complexity and reduce efficiency. Inspired by the success of image-based scale coded BoW representations, we propose a spatio-temporal scale coded BoW (SC-BoW) for video-based recognition. SC-BoW encodes extracted multi-scale information into BoW representations by partitioning spatio-temporal features into sub-groups based on the spatial scale from which they were extracted. We evaluate SC-BoW in two experimental setups. First, we present a general pipeline for real-time action recognition with SC-BoW. Second, we apply SC-BoW to the popular Dense Trajectory feature set. The results show that SC-BoW representations improve performance by 2-7% with low added computational cost. Notably, SC-BoW on Dense Trajectories outperformed more complex deep learning approaches. Thus, scale coding is a low-cost, low-level encoding scheme that increases the classification power of the standard BoW without compromising efficiency.
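
The partitioning idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the number of sub-groups, the scale thresholds, the codebook construction, or the normalization, so the names `assign_words`, `scale_coded_bow`, and `scale_edges`, the hard nearest-codeword assignment, and the per-group L1 normalization are all assumptions made for illustration. Each feature is assigned to a visual word, grouped by the spatial scale it was extracted at, and one histogram per scale group is concatenated into the final vector.

```python
import numpy as np

def assign_words(descriptors, codebook):
    """Hard-assign each descriptor to its nearest codeword (Euclidean)."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def scale_coded_bow(descriptors, scales, codebook, scale_edges):
    """Build one BoW histogram per scale sub-group and concatenate them.

    scale_edges is a hypothetical list of thresholds: n edges give
    n + 1 sub-groups (e.g. small / medium / large for two edges).
    """
    k = codebook.shape[0]
    words = assign_words(descriptors, codebook)
    groups = np.digitize(scales, scale_edges)  # sub-group index per feature
    n_groups = len(scale_edges) + 1

    hist = np.zeros((n_groups, k))
    np.add.at(hist, (groups, words), 1.0)      # count words within each group

    # Assumed normalization: L1-normalize each sub-histogram separately,
    # leaving empty sub-groups as all zeros.
    sums = hist.sum(axis=1, keepdims=True)
    hist = np.divide(hist, sums, out=np.zeros_like(hist), where=sums > 0)
    return hist.ravel()

# Toy data: four 2-D descriptors, a 2-word codebook, three scale groups.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
descriptors = np.array([[0.1, 0.0], [0.9, 1.0], [1.0, 1.1], [0.0, 0.2]])
scales = np.array([1.0, 1.0, 2.0, 4.0])
vec = scale_coded_bow(descriptors, scales, codebook, scale_edges=[1.5, 3.0])
print(vec)
```

The resulting vector has length (number of sub-groups) × (codebook size), so the added cost over a standard BoW is one extra binning step and a proportionally longer representation.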
