
Affect recognition from facial movements and body gestures by hierarchical deep spatio-temporal features and fusion strategy.


Abstract

Affect presentation is periodic and multi-modal, occurring through facial movements, body gestures, and other channels. Studies have shown that temporal selection and multi-modal combination can benefit affect recognition. In this article, we therefore propose a spatio-temporal fusion model that extracts hierarchical spatio-temporal features from selected expressive components, together with a multi-modal hierarchical fusion strategy. Our model learns spatio-temporal hierarchical features from videos with a proposed deep network that combines a convolutional neural network (CNN) and a bidirectional long short-term memory recurrent neural network (BLSTM-RNN) with principal component analysis (PCA). The approach treats each video as a "video sentence": it first obtains a skeleton through a temporal selection process, then segments key words with a sliding window, and finally extracts features from the video skeleton and video words with the deep network. To fuse the multi-modal information, the model combines feature-level and decision-level fusion. Experimental results show that our model improves the multi-modal affect recognition accuracy on the face and body (FABO) database from 95.13% in the existing literature to 99.57%, an improvement of 4.44 percentage points, and achieves a macro average accuracy (MAA) of up to 99.71%.
Copyright © 2017 Elsevier Ltd. All rights reserved.
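The paper's code is not reproduced here; as a rough illustration of the kind of pipeline the abstract describes, the sketch below (in PyTorch, an assumed framework) encodes a window of frames with a small CNN, runs the per-frame features through a bidirectional LSTM, and fuses face and body streams at the feature level before classification. All layer choices, dimensions, and names (FrameCNN, SpatioTemporalEncoder, FusionClassifier) are illustrative assumptions, not the authors' implementation; the PCA compression and the decision-level fusion stage of the full model are omitted for brevity.

    # Hypothetical sketch of a CNN + BLSTM spatio-temporal pipeline with
    # feature-level fusion of two modalities (face, body). Not the authors' code.
    import torch
    import torch.nn as nn

    class FrameCNN(nn.Module):
        """Per-frame spatial encoder (assumed architecture)."""
        def __init__(self, feat_dim=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.fc = nn.Linear(64, feat_dim)

        def forward(self, x):            # x: (batch*time, 3, H, W)
            h = self.net(x).flatten(1)   # (batch*time, 64)
            return self.fc(h)            # (batch*time, feat_dim)

    class SpatioTemporalEncoder(nn.Module):
        """CNN features per frame, then a bidirectional LSTM over the window."""
        def __init__(self, feat_dim=128, hidden=64):
            super().__init__()
            self.cnn = FrameCNN(feat_dim)
            self.blstm = nn.LSTM(feat_dim, hidden, batch_first=True,
                                 bidirectional=True)

        def forward(self, clip):         # clip: (batch, time, 3, H, W)
            b, t = clip.shape[:2]
            f = self.cnn(clip.flatten(0, 1)).view(b, t, -1)
            out, _ = self.blstm(f)       # (batch, time, 2*hidden)
            return out.mean(dim=1)       # temporal pooling -> (batch, 2*hidden)

    class FusionClassifier(nn.Module):
        """Feature-level fusion: concatenate face and body descriptors."""
        def __init__(self, hidden=64, n_classes=10):
            super().__init__()
            self.face = SpatioTemporalEncoder(hidden=hidden)
            self.body = SpatioTemporalEncoder(hidden=hidden)
            self.head = nn.Linear(4 * hidden, n_classes)

        def forward(self, face_clip, body_clip):
            z = torch.cat([self.face(face_clip), self.body(body_clip)], dim=1)
            return self.head(z)          # class logits

    # Example with random "video word" windows of 16 frames at 64x64.
    model = FusionClassifier(n_classes=10)
    face = torch.randn(2, 16, 3, 64, 64)
    body = torch.randn(2, 16, 3, 64, 64)
    print(model(face, body).shape)       # torch.Size([2, 10])

In the full model as the abstract describes it, such feature-level fusion would be complemented by decision-level fusion, i.e. combining the class predictions of the separate face and body streams with those of the joint classifier.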
