Active Vision in Binocular Depth Estimation: A Top-Down Perspective
Depth estimation is an ill-posed problem; objects of different shapes or dimensions, even if at different distances, may project to the same image on the retina. Our brain uses several cues for depth estimation, including monocular cues such as motion parallax and binocular cues such as diplopia. Ho...
Main Authors: | Priorelli, Matteo; Pezzulo, Giovanni; Stoianov, Ivilin Peev |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10526497/ https://www.ncbi.nlm.nih.gov/pubmed/37754196 http://dx.doi.org/10.3390/biomimetics8050445 |
_version_ | 1785111036258418688 |
---|---|
author | Priorelli, Matteo; Pezzulo, Giovanni; Stoianov, Ivilin Peev |
author_sort | Priorelli, Matteo |
collection | PubMed |
description | Depth estimation is an ill-posed problem; objects of different shapes or dimensions, even if at different distances, may project to the same image on the retina. Our brain uses several cues for depth estimation, including monocular cues such as motion parallax and binocular cues such as diplopia. However, it remains unclear how the computations required for depth estimation are implemented in biologically plausible ways. State-of-the-art approaches to depth estimation based on deep neural networks implicitly describe the brain as a hierarchical feature detector. Instead, in this paper we propose an alternative approach that casts depth estimation as a problem of active inference. We show that depth can be inferred by inverting a hierarchical generative model that simultaneously predicts the eyes’ projections from a 2D belief over an object. Model inversion consists of a series of biologically plausible homogeneous transformations based on Predictive Coding principles. Under the plausible assumption of a nonuniform fovea resolution, depth estimation favors an active vision strategy that fixates the object with the eyes, rendering the depth belief more accurate. This strategy is not realized by first fixating on a target and then estimating the depth; instead, it combines the two processes through action–perception cycles, through a mechanism similar to that of saccades during object recognition. The proposed approach requires only local (top-down and bottom-up) message passing, which can be implemented in biologically plausible neural circuits. |
format | Online Article Text |
id | pubmed-10526497 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10526497 2023-09-28 Active Vision in Binocular Depth Estimation: A Top-Down Perspective Priorelli, Matteo Pezzulo, Giovanni Stoianov, Ivilin Peev Biomimetics (Basel) Article MDPI 2023-09-21 /pmc/articles/PMC10526497/ /pubmed/37754196 http://dx.doi.org/10.3390/biomimetics8050445 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Active Vision in Binocular Depth Estimation: A Top-Down Perspective |
topic | Article |