Paper Title
Reconstruction of Perceived Images from fMRI Patterns and Semantic Brain Exploration using Instance-Conditioned GANs
Paper Authors
Abstract
Reconstructing perceived natural images from fMRI signals is one of the most engaging topics in neural decoding research. Prior studies have succeeded in reconstructing either low-level image features or semantic/high-level aspects, but rarely both. In this study, we used an Instance-Conditioned GAN (IC-GAN) model to reconstruct images from fMRI patterns with both accurate semantic attributes and preserved low-level details. The IC-GAN model takes as input a 119-dim noise vector and a 2048-dim instance feature vector extracted from a target image via a self-supervised learning model (SwAV ResNet-50); the instance features condition IC-GAN image generation, while the noise vector introduces variability between samples. We trained ridge regression models to predict the instance features, noise vectors, and dense vectors (the output of the first dense layer of the IC-GAN generator) of stimuli from the corresponding fMRI patterns. We then used the IC-GAN generator to reconstruct novel test images from these fMRI-predicted variables. The generated images achieved state-of-the-art results in capturing the semantic attributes of the original test images while remaining relatively faithful to low-level image details. Finally, we used the learned regression model and the IC-GAN generator to systematically explore and visualize the semantic features that maximally drive each of several regions of interest in the human brain.
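The decoding step described in the abstract (ridge regression from fMRI patterns to the IC-GAN input variables) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the voxel and trial counts and the regularization strength `alpha` are assumed values, and the helpers `fit_ridge` and `predict` are hypothetical names introduced here. Only the 2048-dim instance features and 119-dim noise vectors come from the abstract; the dense vectors would be handled the same way, but their dimensionality is not stated there, so they are omitted.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept.

    Solves W = (Xc^T Xc + alpha * I)^-1 Xc^T Yc on mean-centered data.
    """
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    W = np.linalg.solve(gram, Xc.T @ Yc)
    b = y_mean - x_mean @ W
    return W, b

def predict(X, W, b):
    """Apply the learned linear map to new fMRI patterns."""
    return X @ W + b

rng = np.random.default_rng(0)
n_train, n_test, n_voxels = 200, 20, 50   # assumed sizes, for illustration only
instance_dim, noise_dim = 2048, 119       # dimensions given in the abstract

# Synthetic stand-ins for fMRI patterns (trials x voxels).
fmri_train = rng.normal(size=(n_train, n_voxels))
fmri_test = rng.normal(size=(n_test, n_voxels))

# Synthetic targets generated from a random linear map plus small noise.
map_inst = rng.normal(size=(n_voxels, instance_dim))
map_noise = rng.normal(size=(n_voxels, noise_dim))
inst_train = fmri_train @ map_inst + 0.1 * rng.normal(size=(n_train, instance_dim))
noise_train = fmri_train @ map_noise + 0.1 * rng.normal(size=(n_train, noise_dim))
inst_test = fmri_test @ map_inst

# One ridge model per target variable, as in the abstract.
W_inst, b_inst = fit_ridge(fmri_train, inst_train, alpha=1.0)
W_noise, b_noise = fit_ridge(fmri_train, noise_train, alpha=1.0)

pred_instance = predict(fmri_test, W_inst, b_inst)   # shape (n_test, 2048)
pred_noise = predict(fmri_test, W_noise, b_noise)    # shape (n_test, 119)

# In the full pipeline, these predictions would be passed to a pretrained
# IC-GAN generator to produce the reconstructed images (not shown here).
print(pred_instance.shape, pred_noise.shape)
```

With real data, `alpha` would normally be chosen by cross-validation, since fMRI recordings typically have many more voxels than training trials.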