
From photos to sketches - how humans and deep neural networks process objects across different levels of visual abstraction

Line drawings convey meaning with just a few strokes. Despite strong simplifications, humans can recognize objects depicted in such abstracted images without effort. To what degree do deep convolutional neural networks (CNNs) mirror this human ability to generalize to abstracted object images? While CNNs trained on natural images have been shown to exhibit poor classification performance on drawings, other work has demonstrated highly similar latent representations in the networks for abstracted and natural images. Here, we address these seemingly conflicting findings by analyzing the activation patterns of a CNN trained on natural images across a set of photographs, drawings, and sketches of the same objects and comparing them to human behavior. We find a highly similar representational structure across levels of visual abstraction in early and intermediate layers of the network. This similarity, however, does not translate to later stages in the network, resulting in low classification performance for drawings and sketches. We identify that texture bias in CNNs contributes to the dissimilar representational structure in late layers and the poor performance on drawings. Finally, by fine-tuning late network layers with object drawings, we show that performance can be largely restored, demonstrating the general utility of features learned on natural images in early and intermediate layers for the recognition of drawings. In conclusion, generalization to abstracted images, such as drawings, seems to be an emergent property of CNNs trained on natural images, which is, however, suppressed by domain-related biases that arise during later processing stages in the network.

Bibliographic Details
Main Authors: Singer, Johannes J. D.; Seeliger, Katja; Kietzmann, Tim C.; Hebart, Martin N.
Format: Online Article, Text
Language: English
Published: The Association for Research in Vision and Ophthalmology, 2022
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8822363/
https://www.ncbi.nlm.nih.gov/pubmed/35129578
http://dx.doi.org/10.1167/jov.22.2.4
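The abstract's comparison of representational structure across levels of abstraction is the kind of analysis commonly implemented as representational similarity analysis (RSA): build a representational dissimilarity matrix (RDM) per image style from a layer's activation patterns, then correlate the RDMs. A minimal sketch of that general idea, using random placeholder activations rather than the authors' actual network data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder activations: 10 objects x 100 units, one matrix per
# image style (photos vs. drawings of the same objects). The drawing
# activations are a noisy copy of the photo activations.
photo_acts = rng.standard_normal((10, 100))
drawing_acts = photo_acts + 0.1 * rng.standard_normal((10, 100))

def rdm(acts):
    """Representational dissimilarity matrix: 1 - Pearson r between
    the activation patterns of every pair of objects."""
    return 1.0 - np.corrcoef(acts)

# Compare representational structure across styles by correlating the
# off-diagonal (upper-triangle) entries of the two RDMs.
iu = np.triu_indices(10, k=1)
similarity = np.corrcoef(rdm(photo_acts)[iu], rdm(drawing_acts)[iu])[0, 1]
```

A high `similarity` here would correspond to the paper's finding for early and intermediate layers; a low value to the divergence in late layers.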
Record ID: pubmed-8822363
Collection: PubMed
Institution: National Center for Biotechnology Information
Record format: MEDLINE/PubMed
Published in J Vis by The Association for Research in Vision and Ophthalmology, 2022-02-07. Copyright 2022 The Authors. This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/).