
Deep Supervised, but Not Unsupervised, Models May Explain IT Cortical Representation

Inferior temporal (IT) cortex in human and nonhuman primates serves visual object recognition. Computational object-vision models, although continually improving, do not yet reach human performance. It is unclear to what extent the internal representations of computational models can explain the IT...


Bibliographic Details
Main authors: Khaligh-Razavi, Seyed-Mahdi; Kriegeskorte, Nikolaus
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4222664/
https://www.ncbi.nlm.nih.gov/pubmed/25375136
http://dx.doi.org/10.1371/journal.pcbi.1003915
author Khaligh-Razavi, Seyed-Mahdi
Kriegeskorte, Nikolaus
collection PubMed
description Inferior temporal (IT) cortex in human and nonhuman primates serves visual object recognition. Computational object-vision models, although continually improving, do not yet reach human performance. It is unclear to what extent the internal representations of computational models can explain the IT representation. Here we investigate a wide range of computational model representations (37 in total), testing their categorization performance and their ability to account for the IT representational geometry. The models include well-known neuroscientific object-recognition models (e.g. HMAX, VisNet) along with several models from computer vision (e.g. SIFT, GIST, self-similarity features, and a deep convolutional neural network). We compared the representational dissimilarity matrices (RDMs) of the model representations with the RDMs obtained from human IT (measured with fMRI) and monkey IT (measured with cell recording) for the same set of stimuli (not used in training the models). Better performing models were more similar to IT in that they showed greater clustering of representational patterns by category. In addition, better performing models also more strongly resembled IT in terms of their within-category representational dissimilarities. Representational geometries were significantly correlated between IT and many of the models. However, the categorical clustering observed in IT was largely unexplained by the unsupervised models. The deep convolutional network, which was trained by supervision with over a million category-labeled images, reached the highest categorization performance and also best explained IT, although it did not fully explain the IT data. Combining the features of this model with appropriate weights and adding linear combinations that maximize the margin between animate and inanimate objects and between faces and other objects yielded a representation that fully explained our IT data. 
Overall, our results suggest that explaining IT requires computational features trained through supervised learning to emphasize the behaviorally important categorical divisions prominently reflected in IT.
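The comparison described above, between the representational dissimilarity matrices (RDMs) of a model and of IT, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the record does not say which dissimilarity or comparison measure was used, so correlation distance (1 - Pearson r) and Spearman rank correlation are assumed here, the function names are hypothetical, and the data are synthetic.

```python
import numpy as np


def compute_rdm(patterns):
    """Return an RDM of correlation distances (1 - Pearson r).

    patterns: array of shape (n_stimuli, n_features), one row per stimulus.
    The distance measure is an assumption made for this sketch.
    """
    patterns = np.asarray(patterns, dtype=float)
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(rdm):
    """Return the off-diagonal upper-triangle entries of a square RDM."""
    rows, cols = np.triu_indices(rdm.shape[0], k=1)
    return rdm[rows, cols]


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    values = np.asarray(values, dtype=float).ravel()
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(values.size, dtype=float)
    ranks[order] = np.arange(1, values.size + 1)
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return (rank_sums / counts)[inverse]


def compare_rdms(rdm_a, rdm_b):
    """Spearman rank correlation between the upper triangles of two RDMs.

    The choice of Spearman correlation is an assumption made for this sketch.
    """
    ranks_a = average_ranks(upper_triangle(rdm_a))
    ranks_b = average_ranks(upper_triangle(rdm_b))
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])


# Synthetic demo data (not from the study): 12 stimuli in 2 categories,
# 50 response channels each.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 6)
prototypes = rng.normal(size=(2, 50))
reference = prototypes[labels] + rng.normal(size=(12, 50))     # stands in for IT
similar_model = reference + 0.1 * rng.normal(size=(12, 50))    # close to reference
unrelated_model = rng.normal(size=(12, 50))                    # no shared structure

reference_rdm = compute_rdm(reference)
print("similar model:  ", compare_rdms(reference_rdm, compute_rdm(similar_model)))
print("unrelated model:", compare_rdms(reference_rdm, compute_rdm(unrelated_model)))
```

Only the upper triangle is compared because an RDM is symmetric with a zero diagonal, so the remaining entries carry no additional information.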
format Online
Article
Text
id pubmed-4222664
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4222664 2014-11-13 Deep Supervised, but Not Unsupervised, Models May Explain IT Cortical Representation Khaligh-Razavi, Seyed-Mahdi Kriegeskorte, Nikolaus PLoS Comput Biol Research Article Public Library of Science 2014-11-06 /pmc/articles/PMC4222664/ /pubmed/25375136 http://dx.doi.org/10.1371/journal.pcbi.1003915 Text en © 2014 Khaligh-Razavi, Kriegeskorte http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Deep Supervised, but Not Unsupervised, Models May Explain IT Cortical Representation
topic Research Article