A Mixed Visual Encoding Model Based on the Larger-Scale Receptive Field for Human Brain Activity
Research on visual encoding models for functional magnetic resonance imaging derived from deep neural networks, especially CNN (e.g., VGG16), has been developed. However, CNNs typically use smaller kernel sizes (e.g., 3 × 3) for feature extraction in visual encoding models. Although the receptive field size of CNN can be enlarged by increasing the network depth or subsampling, it is limited by the small size of the convolution kernel, leading to an insufficient receptive field size. In biological research, the size of the neuronal population receptive field of high-level visual encoding regions is usually three to four times that of low-level visual encoding regions. Thus, CNNs with a larger receptive field size align with the biological findings. The RepLKNet model directly expands the convolution kernel size to obtain a larger-scale receptive field. Therefore, this paper proposes a mixed model to replace CNN for feature extraction in visual encoding models. The proposed model mixes RepLKNet and VGG so that the mixed model has a receptive field of different sizes to extract more feature information from the image. The experimental results indicate that the mixed model achieves better encoding performance in multiple regions of the visual cortex than the traditional convolutional model. Also, a larger-scale receptive field should be considered in building visual encoding models so that the convolution network can play a more significant role in visual representations.
Main Authors: Ma, Shuxiao; Wang, Linyuan; Chen, Panpan; Qin, Ruoxi; Hou, Libin; Yan, Bin
Format: Online Article Text
Language: English
Published: MDPI, 2022
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9775903/ https://www.ncbi.nlm.nih.gov/pubmed/36552093 http://dx.doi.org/10.3390/brainsci12121633
_version_ | 1784855747719331840 |
author | Ma, Shuxiao Wang, Linyuan Chen, Panpan Qin, Ruoxi Hou, Libin Yan, Bin |
author_facet | Ma, Shuxiao Wang, Linyuan Chen, Panpan Qin, Ruoxi Hou, Libin Yan, Bin |
author_sort | Ma, Shuxiao |
collection | PubMed |
description | Research on visual encoding models for functional magnetic resonance imaging derived from deep neural networks, especially CNN (e.g., VGG16), has been developed. However, CNNs typically use smaller kernel sizes (e.g., 3 × 3) for feature extraction in visual encoding models. Although the receptive field size of CNN can be enlarged by increasing the network depth or subsampling, it is limited by the small size of the convolution kernel, leading to an insufficient receptive field size. In biological research, the size of the neuronal population receptive field of high-level visual encoding regions is usually three to four times that of low-level visual encoding regions. Thus, CNNs with a larger receptive field size align with the biological findings. The RepLKNet model directly expands the convolution kernel size to obtain a larger-scale receptive field. Therefore, this paper proposes a mixed model to replace CNN for feature extraction in visual encoding models. The proposed model mixes RepLKNet and VGG so that the mixed model has a receptive field of different sizes to extract more feature information from the image. The experimental results indicate that the mixed model achieves better encoding performance in multiple regions of the visual cortex than the traditional convolutional model. Also, a larger-scale receptive field should be considered in building visual encoding models so that the convolution network can play a more significant role in visual representations. |
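As background for the abstract's claim that stacking small kernels grows the receptive field only slowly, a minimal sketch of the standard effective-receptive-field recurrence (illustrative only; the function name and layer configurations are assumptions, not taken from the paper):

```python
def receptive_field(layers):
    """Effective receptive field of a stack of convolutions.

    layers: list of (kernel_size, stride) tuples, applied in order.
    Uses the standard recurrence: each layer adds (k - 1) times the
    product of all preceding strides to the receptive field.
    """
    rf, jump = 1, 1
    for k, s in layers:
        rf += (k - 1) * jump
        jump *= s
    return rf

# Ten stacked 3x3 convolutions with stride 1 (a VGG-style stack):
print(receptive_field([(3, 1)] * 10))  # 21
# A single large kernel in the spirit of RepLKNet:
print(receptive_field([(31, 1)]))      # 31
```

This illustrates the abstract's point: without subsampling, depth alone enlarges the receptive field by only 2 pixels per 3 × 3 layer, whereas a single large kernel reaches the same scale directly.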
format | Online Article Text |
id | pubmed-9775903 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-97759032022-12-23 A Mixed Visual Encoding Model Based on the Larger-Scale Receptive Field for Human Brain Activity Ma, Shuxiao Wang, Linyuan Chen, Panpan Qin, Ruoxi Hou, Libin Yan, Bin Brain Sci Article MDPI 2022-11-29 /pmc/articles/PMC9775903/ /pubmed/36552093 http://dx.doi.org/10.3390/brainsci12121633 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Ma, Shuxiao Wang, Linyuan Chen, Panpan Qin, Ruoxi Hou, Libin Yan, Bin A Mixed Visual Encoding Model Based on the Larger-Scale Receptive Field for Human Brain Activity |
title | A Mixed Visual Encoding Model Based on the Larger-Scale Receptive Field for Human Brain Activity |
title_full | A Mixed Visual Encoding Model Based on the Larger-Scale Receptive Field for Human Brain Activity |
title_fullStr | A Mixed Visual Encoding Model Based on the Larger-Scale Receptive Field for Human Brain Activity |
title_full_unstemmed | A Mixed Visual Encoding Model Based on the Larger-Scale Receptive Field for Human Brain Activity |
title_short | A Mixed Visual Encoding Model Based on the Larger-Scale Receptive Field for Human Brain Activity |
title_sort | mixed visual encoding model based on the larger-scale receptive field for human brain activity |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9775903/ https://www.ncbi.nlm.nih.gov/pubmed/36552093 http://dx.doi.org/10.3390/brainsci12121633 |
work_keys_str_mv | AT mashuxiao amixedvisualencodingmodelbasedonthelargerscalereceptivefieldforhumanbrainactivity AT wanglinyuan amixedvisualencodingmodelbasedonthelargerscalereceptivefieldforhumanbrainactivity AT chenpanpan amixedvisualencodingmodelbasedonthelargerscalereceptivefieldforhumanbrainactivity AT qinruoxi amixedvisualencodingmodelbasedonthelargerscalereceptivefieldforhumanbrainactivity AT houlibin amixedvisualencodingmodelbasedonthelargerscalereceptivefieldforhumanbrainactivity AT yanbin amixedvisualencodingmodelbasedonthelargerscalereceptivefieldforhumanbrainactivity AT mashuxiao mixedvisualencodingmodelbasedonthelargerscalereceptivefieldforhumanbrainactivity AT wanglinyuan mixedvisualencodingmodelbasedonthelargerscalereceptivefieldforhumanbrainactivity AT chenpanpan mixedvisualencodingmodelbasedonthelargerscalereceptivefieldforhumanbrainactivity AT qinruoxi mixedvisualencodingmodelbasedonthelargerscalereceptivefieldforhumanbrainactivity AT houlibin mixedvisualencodingmodelbasedonthelargerscalereceptivefieldforhumanbrainactivity AT yanbin mixedvisualencodingmodelbasedonthelargerscalereceptivefieldforhumanbrainactivity |