Feedforward object-vision models only tolerate small image variations compared to human

Invariant object recognition is a remarkable ability of the primate visual system, and its underlying mechanisms have long been under intense investigation. Computational modeling is a valuable tool for understanding the processes involved in invariant object recognition. Although recent...


Bibliographic Details
Main Authors:	Ghodrati, Masoud; Farzmahdi, Amirhossein; Rajaei, Karim; Ebrahimpour, Reza; Khaligh-Razavi, Seyed-Mahdi
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2014
Subjects:	Neuroscience
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4103258/ https://www.ncbi.nlm.nih.gov/pubmed/25100986 http://dx.doi.org/10.3389/fncom.2014.00074

author	Ghodrati, Masoud; Farzmahdi, Amirhossein; Rajaei, Karim; Ebrahimpour, Reza; Khaligh-Razavi, Seyed-Mahdi
collection	PubMed
description	Invariant object recognition is a remarkable ability of the primate visual system, and its underlying mechanisms have long been under intense investigation. Computational modeling is a valuable tool for understanding the processes involved in invariant object recognition. Although recent computational models have shown outstanding performance on challenging image databases, they fail to perform well in image categorization under more complex image variations. Studies have shown that building sparse representations of objects by extracting more informative visual features through a feedforward sweep can lead to higher recognition performance. Here, however, we show that when the complexity of image variations is high, even this approach results in poor performance compared to humans. To assess the performance of models and humans in invariant object recognition tasks, we built a parametrically controlled image database consisting of several object categories varied in different dimensions and levels, rendered from 3D planes. Comparing the performance of several object recognition models with that of human observers shows that the models perform similarly to humans in categorization tasks only under low-level image variations. Furthermore, the results of our behavioral experiments demonstrate that, even under difficult experimental conditions (i.e., briefly presented masked stimuli with complex image variations), human observers performed outstandingly well, suggesting that the models are still far from resembling humans in invariant object recognition. Taken together, we suggest that learning sparse informative visual features, although desirable, is not a complete solution for future progress in object-vision modeling. We show that this approach is not of significant help in solving the computational crux of object recognition (i.e., invariant object recognition) when the identity-preserving image variations become more complex.
format	Online Article Text
id	pubmed-4103258
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-4103258 2014-08-06 Feedforward object-vision models only tolerate small image variations compared to human. Ghodrati, Masoud; Farzmahdi, Amirhossein; Rajaei, Karim; Ebrahimpour, Reza; Khaligh-Razavi, Seyed-Mahdi. Front Comput Neurosci. Neuroscience. Frontiers Media S.A. 2014-07-18 /pmc/articles/PMC4103258/ /pubmed/25100986 http://dx.doi.org/10.3389/fncom.2014.00074 Text en Copyright © 2014 Ghodrati, Farzmahdi, Rajaei, Ebrahimpour and Khaligh-Razavi. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Feedforward object-vision models only tolerate small image variations compared to human
topic	Neuroscience
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4103258/ https://www.ncbi.nlm.nih.gov/pubmed/25100986 http://dx.doi.org/10.3389/fncom.2014.00074
