Cargando…

Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI

The reconstruction of visual stimuli from fMRI signals, which record brain activity, is a challenging task with crucial research value in the fields of neuroscience and machine learning. Previous studies tend to emphasize reconstructing pixel-level features (contours, colors, etc.) or semantic featu...

Descripción completa

Detalles Bibliográficos
Autores principales: Meng, Lu, Yang, Chuanhao
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10604156/
https://www.ncbi.nlm.nih.gov/pubmed/37892847
http://dx.doi.org/10.3390/bioengineering10101117
_version_ 1785126769133617152
author Meng, Lu
Yang, Chuanhao
author_facet Meng, Lu
Yang, Chuanhao
author_sort Meng, Lu
collection PubMed
description The reconstruction of visual stimuli from fMRI signals, which record brain activity, is a challenging task with crucial research value in the fields of neuroscience and machine learning. Previous studies tend to emphasize reconstructing pixel-level features (contours, colors, etc.) or semantic features (object category) of the stimulus image, but typically, these properties are not reconstructed together. In this context, we introduce a novel three-stage visual reconstruction approach called the Dual-guided Brain Diffusion Model (DBDM). Initially, we employ the Very Deep Variational Autoencoder (VDVAE) to reconstruct a coarse image from fMRI data, capturing the underlying details of the original image. Subsequently, the Bootstrapping Language-Image Pre-training (BLIP) model is utilized to provide a semantic annotation for each image. Finally, the image-to-image generation pipeline of the Versatile Diffusion (VD) model is utilized to recover natural images from the fMRI patterns guided by both visual and semantic information. The experimental results demonstrate that DBDM surpasses previous approaches in both qualitative and quantitative comparisons. In particular, the best performance is achieved by DBDM in reconstructing the semantic details of the original image; the Inception, CLIP and SwAV distances are 0.611, 0.225 and 0.405, respectively. This confirms the efficacy of our model and its potential to advance visual decoding research.
format Online
Article
Text
id pubmed-10604156
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-106041562023-10-28 Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI Meng, Lu Yang, Chuanhao Bioengineering (Basel) Article The reconstruction of visual stimuli from fMRI signals, which record brain activity, is a challenging task with crucial research value in the fields of neuroscience and machine learning. Previous studies tend to emphasize reconstructing pixel-level features (contours, colors, etc.) or semantic features (object category) of the stimulus image, but typically, these properties are not reconstructed together. In this context, we introduce a novel three-stage visual reconstruction approach called the Dual-guided Brain Diffusion Model (DBDM). Initially, we employ the Very Deep Variational Autoencoder (VDVAE) to reconstruct a coarse image from fMRI data, capturing the underlying details of the original image. Subsequently, the Bootstrapping Language-Image Pre-training (BLIP) model is utilized to provide a semantic annotation for each image. Finally, the image-to-image generation pipeline of the Versatile Diffusion (VD) model is utilized to recover natural images from the fMRI patterns guided by both visual and semantic information. The experimental results demonstrate that DBDM surpasses previous approaches in both qualitative and quantitative comparisons. In particular, the best performance is achieved by DBDM in reconstructing the semantic details of the original image; the Inception, CLIP and SwAV distances are 0.611, 0.225 and 0.405, respectively. This confirms the efficacy of our model and its potential to advance visual decoding research. MDPI 2023-09-24 /pmc/articles/PMC10604156/ /pubmed/37892847 http://dx.doi.org/10.3390/bioengineering10101117 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Meng, Lu
Yang, Chuanhao
Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI
title Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI
title_full Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI
title_fullStr Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI
title_full_unstemmed Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI
title_short Dual-Guided Brain Diffusion Model: Natural Image Reconstruction from Human Visual Stimulus fMRI
title_sort dual-guided brain diffusion model: natural image reconstruction from human visual stimulus fmri
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10604156/
https://www.ncbi.nlm.nih.gov/pubmed/37892847
http://dx.doi.org/10.3390/bioengineering10101117
work_keys_str_mv AT menglu dualguidedbraindiffusionmodelnaturalimagereconstructionfromhumanvisualstimulusfmri
AT yangchuanhao dualguidedbraindiffusionmodelnaturalimagereconstructionfromhumanvisualstimulusfmri