
Use of superordinate labels yields more robust and human-like visual representations in convolutional neural networks


Bibliographic Details
Main Authors: Ahn, Seoyoung, Zelinsky, Gregory J., Lupyan, Gary
Format: Online Article Text
Language: English
Published: The Association for Research in Vision and Ophthalmology, 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8727315/
https://www.ncbi.nlm.nih.gov/pubmed/34967860
http://dx.doi.org/10.1167/jov.21.13.13
Collection: PubMed
Description: Human visual recognition is outstandingly robust. People can recognize thousands of object classes in the blink of an eye (50–200 ms) even when the objects vary in position, scale, viewpoint, and illumination. What aspects of human category learning facilitate the extraction of invariant visual features for object recognition? Here, we explore the possibility that a contributing factor to learning such robust visual representations may be a taxonomic hierarchy communicated in part by common labels to which people are exposed as part of natural language. We did this by manipulating the taxonomic level of labels (e.g., superordinate-level [mammal, fruit, vehicle] and basic-level [dog, banana, van]), and the order in which these training labels were used during learning by a convolutional neural network. We found that training the model with hierarchical labels yields visual representations that are more robust to image transformations (e.g., position/scale, illumination, noise, and blur), especially when the model was first trained with superordinate labels and then fine-tuned with basic labels. We also found that superordinate-label followed by basic-label training best predicts functional magnetic resonance imaging responses in visual cortex and behavioral similarity judgments recorded while viewing naturalistic images. The benefits of training with superordinate labels in the earlier stages of category learning are discussed in the context of representational efficiency and generalization.
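The two-stage label curriculum described in the abstract can be sketched as a relabeling step applied before each training stage: stage 1 collapses basic-level labels to their superordinate parents, stage 2 restores the basic-level labels for fine-tuning. This is only an illustration under assumed names; the hierarchy, `relabel` function, and filenames below are hypothetical, not the authors' actual code or dataset.

```python
# Hypothetical basic-level -> superordinate-level hierarchy, following the
# examples given in the abstract (mammal/fruit/vehicle over dog/banana/van).
SUPERORDINATE = {
    "dog": "mammal",
    "cat": "mammal",
    "banana": "fruit",
    "apple": "fruit",
    "van": "vehicle",
    "truck": "vehicle",
}

def relabel(dataset, stage):
    """Return (image, label) pairs for one curriculum stage.

    stage == "superordinate": collapse basic labels to their parents
    (used for the initial training stage).
    stage == "basic": keep the fine-grained labels (used for fine-tuning).
    """
    if stage == "superordinate":
        return [(img, SUPERORDINATE[lbl]) for img, lbl in dataset]
    return list(dataset)

# Toy dataset of (image, basic-level label) pairs.
data = [("img0.jpg", "dog"), ("img1.jpg", "banana"), ("img2.jpg", "van")]

stage1 = relabel(data, "superordinate")  # targets for initial training
stage2 = relabel(data, "basic")          # targets for fine-tuning
```

In an actual experiment, `stage1` targets would drive the first round of CNN training and `stage2` the subsequent fine-tuning; only the target labels change between stages, not the images.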
ID: pubmed-8727315
Institution: National Center for Biotechnology Information
Record format: MEDLINE/PubMed
Journal: J Vis
Published online: 2021-12-30
Copyright 2021 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (https://creativecommons.org/licenses/by-nc-nd/4.0/).