Human Visual Pathways for Action Recognition versus Deep Convolutional Neural Networks: Representation Correspondence in Late but Not Early Layers.

October 16, 2024 Cognitive Neuroscience, Neurology

Abstract

Deep convolutional neural networks (DCNNs) have attained human-level performance for object categorization and exhibited representation alignment between network layers and brain regions. Does such representation alignment naturally extend to other visual tasks beyond recognizing objects in static images? In this study, we expanded the exploration to the recognition of human actions from videos and assessed the representation capabilities and alignment of two-stream DCNNs in comparison with brain regions situated along the ventral and dorsal pathways. Using decoding analysis and representational similarity analysis, we find that DCNN models do not show hierarchical representation alignment with the human brain across visual regions when processing action videos. Instead, later layers of DCNN models demonstrate greater representation similarity to the human visual cortex. These findings held for two display formats: photorealistic avatars with full-body information and simplified stimuli in the point-light display. The discrepancies in representation alignment suggest fundamental differences in how DCNNs and the human brain represent dynamic visual information related to actions.

© 2024 Massachusetts Institute of Technology. Published under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
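For readers unfamiliar with representational similarity analysis, the following is a minimal sketch of the general technique, not the authors' pipeline. It assumes each network layer and each brain region is summarized as one activation pattern per action condition, builds a representational dissimilarity matrix (RDM) from 1 minus the Pearson correlation, and compares two RDMs with a Spearman correlation. All function names, array shapes, and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def rdm(features):
    """Representational dissimilarity matrix: 1 - Pearson r between conditions.

    features: array of shape (n_conditions, n_features), one row per action.
    """
    return 1.0 - np.corrcoef(features)


def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries as a flat vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def spearman(x, y):
    """Spearman correlation as Pearson on ranks (assumes no tied values)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return np.corrcoef(rank_x, rank_y)[0, 1]


def rsa_score(layer_features, brain_patterns):
    """Similarity between the RDM of a model layer and the RDM of a brain region."""
    return spearman(upper_triangle(rdm(layer_features)),
                    upper_triangle(rdm(brain_patterns)))


# Synthetic stand-ins (assumption): 8 action conditions, arbitrary feature counts.
rng = np.random.default_rng(0)
n_actions = 8
brain = rng.normal(size=(n_actions, 50))          # e.g. voxel patterns
noisy_copy = brain + 0.1 * rng.normal(size=brain.shape)
unrelated = rng.normal(size=(n_actions, 30))      # e.g. an unrelated layer

print("noisy copy vs brain:", rsa_score(noisy_copy, brain))
print("unrelated vs brain: ", rsa_score(unrelated, brain))
```

In an analysis of the kind the abstract describes, a score like this would be computed for every pairing of layer and region. Hierarchical alignment would appear as early layers matching early visual regions best and late layers matching higher regions best, which is the pattern the study reports not finding for action videos.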


