
Understanding 3D vision as a policy network

It is often assumed that the brain builds 3D coordinate frames, in retinal coordinates (with binocular disparity giving the third dimension), head-centred, body-centred and world-centred coordinates. This paper questions that assumption and begins to sketch an alternative based on, essentially, a se...

Bibliographic Details
Main Author:	Glennerster, Andrew
Format:	Online Article Text
Language:	English
Published:	The Royal Society 2023
Subjects:	Articles
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9745881/ https://www.ncbi.nlm.nih.gov/pubmed/36511403 http://dx.doi.org/10.1098/rstb.2021.0448

_version_	1784849245871800320
author	Glennerster, Andrew
author_facet	Glennerster, Andrew
author_sort	Glennerster, Andrew
collection	PubMed
description	It is often assumed that the brain builds 3D coordinate frames, in retinal coordinates (with binocular disparity giving the third dimension), head-centred, body-centred and world-centred coordinates. This paper questions that assumption and begins to sketch an alternative based on, essentially, a set of reflexes. A ‘policy network’ is a term used in reinforcement learning to describe the set of actions that are generated by an agent depending on its current state. This is an untypical starting point for describing 3D vision, but a policy network can serve as a useful representation both for the 3D layout of a scene and the location of the observer within it. It avoids 3D reconstruction of the type used in computer vision but is similar to recent representations for navigation generated through reinforcement learning. A policy network for saccades (pure rotations of the camera/eye) is a logical starting point for understanding (i) an ego-centric representation of space (e.g. Marr’s (Marr 1982 Vision: a computational investigation into the human representation and processing of visual information) 2½-D sketch) and (ii) a hierarchical, compositional representation for navigation. The potential neural implementation of policy networks is straightforward; a network with a large range of sensory and task-related inputs such as the cerebellum would be capable of implementing this input/output function. This is not the case for 3D coordinate transformations in the brain: no neurally implementable proposals have yet been put forward that could carry out a transformation of a visual scene from retinal to world-based coordinates. Hence, if the representation underlying 3D vision can be described as a policy network (in which the actions are either saccades or head translations), this would be a significant step towards a neurally plausible model of 3D vision. This article is part of the theme issue ‘New approaches to 3D vision’.
format	Online Article Text
id	pubmed-9745881
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	The Royal Society
record_format	MEDLINE/PubMed
spelling	pubmed-9745881 2022-12-15 Understanding 3D vision as a policy network Glennerster, Andrew Philos Trans R Soc Lond B Biol Sci Articles The Royal Society 2023-01-30 2022-12-13 /pmc/articles/PMC9745881/ /pubmed/36511403 http://dx.doi.org/10.1098/rstb.2021.0448 Text en © 2022 The Authors. Published by the Royal Society under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, provided the original author and source are credited.
spellingShingle	Articles Glennerster, Andrew Understanding 3D vision as a policy network
title	Understanding 3D vision as a policy network
title_full	Understanding 3D vision as a policy network
title_fullStr	Understanding 3D vision as a policy network
title_full_unstemmed	Understanding 3D vision as a policy network
title_short	Understanding 3D vision as a policy network
title_sort	understanding 3d vision as a policy network
topic	Articles
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9745881/ https://www.ncbi.nlm.nih.gov/pubmed/36511403 http://dx.doi.org/10.1098/rstb.2021.0448
work_keys_str_mv	AT glennersterandrew understanding3dvisionasapolicynetwork

Similar Items