
Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI.

Abstract

The reconstruction of visual stimuli from fMRI signals, which record brain activity, is a challenging task of crucial research value in neuroscience and machine learning. Previous studies tend to emphasize reconstructing either the pixel-level features (contours, colors, etc.) or the semantic features (object category) of the stimulus image, but rarely both together. In this context, we introduce a novel three-stage visual reconstruction approach called the Dual-guided Brain Diffusion Model (DBDM). First, we employ the Very Deep Variational Autoencoder (VDVAE) to reconstruct a coarse image from the fMRI data, capturing the underlying details of the original image. Next, the Bootstrapping Language-Image Pre-training (BLIP) model provides a semantic annotation for each image. Finally, the image-to-image generation pipeline of the Versatile Diffusion (VD) model recovers natural images from the fMRI patterns, guided by both visual and semantic information. Experimental results demonstrate that DBDM surpasses previous approaches in both qualitative and quantitative comparisons. In particular, DBDM achieves the best performance in reconstructing the semantic details of the original image, with Inception, CLIP, and SwAV distances of 0.611, 0.225, and 0.405, respectively. This confirms the efficacy of our model and its potential to advance visual decoding research.
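
To make the three-stage flow concrete, below is a minimal Python sketch of the pipeline structure described in the abstract. It is not the authors' implementation: the fMRI-to-VDVAE stage is a placeholder stub, the public BLIP captioning checkpoint and Hugging Face diffusers' VersatileDiffusionDualGuidedPipeline are assumed stand-ins for the BLIP and VD stages, and captioning the coarse reconstruction (rather than however the paper derives semantic features) is an assumption made purely for illustration.

```python
# Hypothetical sketch of the three-stage DBDM pipeline described in the abstract.
# Stage 1 (fMRI -> coarse image via VDVAE) is stubbed; stages 2 and 3 use public
# BLIP and Versatile Diffusion checkpoints from the Hugging Face Hub as stand-ins.

import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration
from diffusers import VersatileDiffusionDualGuidedPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"


def coarse_image_from_fmri(fmri_pattern) -> Image.Image:
    """Stage 1 placeholder: DBDM regresses VDVAE latents from the fMRI pattern and
    decodes them into a coarse reconstruction. The regressor and VDVAE decoder are
    not sketched here; a flat gray image stands in so the rest of the pipeline runs."""
    return Image.new("RGB", (512, 512), color=(128, 128, 128))


def caption_image(image: Image.Image) -> str:
    """Stage 2: generate a semantic annotation (caption) with a BLIP captioning model."""
    processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained(
        "Salesforce/blip-image-captioning-base"
    ).to(device)
    inputs = processor(images=image, return_tensors="pt").to(device)
    out = model.generate(**inputs, max_new_tokens=30)
    return processor.decode(out[0], skip_special_tokens=True)


def refine_with_versatile_diffusion(coarse: Image.Image, caption: str) -> Image.Image:
    """Stage 3: refine the coarse image with Versatile Diffusion, conditioning on both
    the coarse image (visual guidance) and the caption (semantic guidance)."""
    pipe = VersatileDiffusionDualGuidedPipeline.from_pretrained(
        "shi-labs/versatile-diffusion"
    ).to(device)
    result = pipe(
        prompt=caption,
        image=coarse,
        text_to_image_strength=0.5,   # balance between semantic and visual guidance
        num_inference_steps=50,
        guidance_scale=7.5,
    )
    return result.images[0]


if __name__ == "__main__":
    coarse = coarse_image_from_fmri(fmri_pattern=None)          # Stage 1 (stub)
    caption = caption_image(coarse)                             # Stage 2
    final = refine_with_versatile_diffusion(coarse, caption)    # Stage 3
    final.save("reconstruction.png")
```

The key design point the sketch illustrates is the dual guidance: the coarse VDVAE-based reconstruction constrains low-level layout while the caption constrains object semantics, and the diffusion stage fuses the two into a natural-looking image.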
