
Do Humans and Deep Convolutional Neural Networks Use Visual Information Similarly for the Categorization of Natural Scenes?


Bibliographic Details
Main Authors: De Cesarei, Andrea; Cavicchi, Shari; Cristadoro, Giampaolo; Lippi, Marco
Format: Online Article (Text)
Language: English
Published: John Wiley and Sons Inc., 2021
Subjects: Regular Articles
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8365760/
https://www.ncbi.nlm.nih.gov/pubmed/34170027
http://dx.doi.org/10.1111/cogs.13009
author De Cesarei, Andrea
Cavicchi, Shari
Cristadoro, Giampaolo
Lippi, Marco
collection PubMed
description The investigation of visual categorization has recently been aided by the introduction of deep convolutional neural networks (CNNs), which achieve unprecedented accuracy in picture classification after extensive training. Although the architecture of CNNs is inspired by the organization of the visual brain, the similarity between CNN and human visual processing remains unclear. Here, we investigated this issue by engaging humans and CNNs in a two‐class visual categorization task. To this end, pictures containing animals or vehicles were modified to contain only low or high spatial frequency (HSF) information, or were scrambled in the phase of the spatial frequency spectrum. For all types of degradation, accuracy increased for both humans and CNNs as degradation was reduced; however, the thresholds for accurate categorization differed between humans and CNNs. Differences were most marked for HSF information compared to the other two types of degradation, both in overall accuracy and in image‐level agreement between humans and CNNs. The difficulty that CNNs showed in categorizing high‐passed natural scenes was reduced by picture whitening, a procedure inspired by how visual systems process natural images. The results are discussed in relation to adaptation to regularities in the visual environment (scene statistics); if the visual characteristics of the environment are not learned by CNNs, their visual categorization may depend on only a subset of the visual information on which humans rely, for example, low spatial frequency information.
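As a rough illustration (not the authors' code or stimuli), the three image degradations named in the abstract, low/high spatial-frequency filtering, phase scrambling of the spatial-frequency spectrum, and spectral whitening, can be sketched in Python with NumPy. The cutoff values and the synthetic test image below are illustrative assumptions, not parameters from the study.

import numpy as np

def _radial_freq(shape):
    # Radial spatial-frequency map (cycles/image) for an image of this shape.
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return np.hypot(fy * shape[0], fx * shape[1])

def bandpass(img, low=None, high=None):
    # Keep only frequencies above `low` and/or below `high` (cycles/image).
    spec = np.fft.fft2(img)
    r = _radial_freq(img.shape)
    mask = np.ones_like(r, dtype=bool)
    if low is not None:
        mask &= r >= low   # high-pass: discard low spatial frequencies
    if high is not None:
        mask &= r <= high  # low-pass: discard high spatial frequencies
    return np.real(np.fft.ifft2(spec * mask))

def phase_scramble(img, rng=None):
    # Randomize the phase spectrum while preserving the amplitude spectrum.
    rng = np.random.default_rng() if rng is None else rng
    spec = np.fft.fft2(img)
    random_phase = np.exp(1j * rng.uniform(0, 2 * np.pi, img.shape))
    return np.real(np.fft.ifft2(np.abs(spec) * random_phase))

def whiten(img, eps=1e-8):
    # Flatten the amplitude spectrum (whitening) while keeping the phase.
    spec = np.fft.fft2(img)
    return np.real(np.fft.ifft2(spec / (np.abs(spec) + eps)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.standard_normal((128, 128))   # stand-in for a grayscale natural scene
    hsf = bandpass(img, low=24)              # high-pass (HSF-only) version; cutoff is hypothetical
    lsf = bandpass(img, high=8)              # low-pass (LSF-only) version; cutoff is hypothetical
    scrambled = phase_scramble(img, rng)     # phase-scrambled version
    whitened = whiten(hsf)                   # whitening applied to a high-passed image
    print(hsf.shape, lsf.shape, scrambled.shape, whitened.shape)

In this sketch, whitening divides each Fourier coefficient by its own amplitude, equalizing energy across spatial frequencies while leaving the phase (and thus the scene structure) intact; this is one common way to implement the kind of spectral normalization the abstract refers to.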
format Online
Article
Text
id pubmed-8365760
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-8365760 2021-08-23 Cogn Sci, Regular Articles. John Wiley and Sons Inc.; published online 2021-06-25 (2021-06 issue). /pmc/articles/PMC8365760/ /pubmed/34170027 http://dx.doi.org/10.1111/cogs.13009 Text en © 2021 The Authors. Cognitive Science published by Wiley Periodicals LLC on behalf of Cognitive Science Society (CSS). This is an open access article under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
title Do Humans and Deep Convolutional Neural Networks Use Visual Information Similarly for the Categorization of Natural Scenes?
topic Regular Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8365760/
https://www.ncbi.nlm.nih.gov/pubmed/34170027
http://dx.doi.org/10.1111/cogs.13009