Active Vision in Binocular Depth Estimation: A Top-Down Perspective

Depth estimation is an ill-posed problem; objects of different shapes or dimensions, even if at different distances, may project to the same image on the retina. Our brain uses several cues for depth estimation, including monocular cues such as motion parallax and binocular cues such as diplopia. Ho...

Bibliographic Details
Main authors: Priorelli, Matteo, Pezzulo, Giovanni, Stoianov, Ivilin Peev
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10526497/
https://www.ncbi.nlm.nih.gov/pubmed/37754196
http://dx.doi.org/10.3390/biomimetics8050445
_version_ 1785111036258418688
author Priorelli, Matteo
Pezzulo, Giovanni
Stoianov, Ivilin Peev
author_facet Priorelli, Matteo
Pezzulo, Giovanni
Stoianov, Ivilin Peev
author_sort Priorelli, Matteo
collection PubMed
description Depth estimation is an ill-posed problem; objects of different shapes or dimensions, even if at different distances, may project to the same image on the retina. Our brain uses several cues for depth estimation, including monocular cues such as motion parallax and binocular cues such as diplopia. However, it remains unclear how the computations required for depth estimation are implemented in biologically plausible ways. State-of-the-art approaches to depth estimation based on deep neural networks implicitly describe the brain as a hierarchical feature detector. Instead, in this paper we propose an alternative approach that casts depth estimation as a problem of active inference. We show that depth can be inferred by inverting a hierarchical generative model that simultaneously predicts the eyes’ projections from a 2D belief over an object. Model inversion consists of a series of biologically plausible homogeneous transformations based on Predictive Coding principles. Under the plausible assumption of a nonuniform fovea resolution, depth estimation favors an active vision strategy that fixates the object with the eyes, rendering the depth belief more accurate. This strategy is not realized by first fixating on a target and then estimating the depth; instead, it combines the two processes through action–perception cycles, with a similar mechanism of the saccades during object recognition. The proposed approach requires only local (top-down and bottom-up) message passing, which can be implemented in biologically plausible neural circuits.
format Online
Article
Text
id pubmed-10526497
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10526497 2023-09-28 Active Vision in Binocular Depth Estimation: A Top-Down Perspective Priorelli, Matteo Pezzulo, Giovanni Stoianov, Ivilin Peev Biomimetics (Basel) Article Depth estimation is an ill-posed problem; objects of different shapes or dimensions, even if at different distances, may project to the same image on the retina. Our brain uses several cues for depth estimation, including monocular cues such as motion parallax and binocular cues such as diplopia. However, it remains unclear how the computations required for depth estimation are implemented in biologically plausible ways. State-of-the-art approaches to depth estimation based on deep neural networks implicitly describe the brain as a hierarchical feature detector. Instead, in this paper we propose an alternative approach that casts depth estimation as a problem of active inference. We show that depth can be inferred by inverting a hierarchical generative model that simultaneously predicts the eyes’ projections from a 2D belief over an object. Model inversion consists of a series of biologically plausible homogeneous transformations based on Predictive Coding principles. Under the plausible assumption of a nonuniform fovea resolution, depth estimation favors an active vision strategy that fixates the object with the eyes, rendering the depth belief more accurate. This strategy is not realized by first fixating on a target and then estimating the depth; instead, it combines the two processes through action–perception cycles, with a similar mechanism of the saccades during object recognition. The proposed approach requires only local (top-down and bottom-up) message passing, which can be implemented in biologically plausible neural circuits. MDPI 2023-09-21 /pmc/articles/PMC10526497/ /pubmed/37754196 http://dx.doi.org/10.3390/biomimetics8050445 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Priorelli, Matteo
Pezzulo, Giovanni
Stoianov, Ivilin Peev
Active Vision in Binocular Depth Estimation: A Top-Down Perspective
title Active Vision in Binocular Depth Estimation: A Top-Down Perspective
title_full Active Vision in Binocular Depth Estimation: A Top-Down Perspective
title_fullStr Active Vision in Binocular Depth Estimation: A Top-Down Perspective
title_full_unstemmed Active Vision in Binocular Depth Estimation: A Top-Down Perspective
title_short Active Vision in Binocular Depth Estimation: A Top-Down Perspective
title_sort active vision in binocular depth estimation: a top-down perspective
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10526497/
https://www.ncbi.nlm.nih.gov/pubmed/37754196
http://dx.doi.org/10.3390/biomimetics8050445
work_keys_str_mv AT priorellimatteo activevisioninbinoculardepthestimationatopdownperspective
AT pezzulogiovanni activevisioninbinoculardepthestimationatopdownperspective
AT stoianovivilinpeev activevisioninbinoculardepthestimationatopdownperspective