
Can You Tell Me where Wally Is?

Referring expression generation can be thought of as the converse problem to visual search: given a scene and a target, the participant's task is to generate a description which would allow somebody else to quickly and accurately locate the target. While this problem has been studied in psychol...


Bibliographic Details
Main Authors:	Clarke, A D F, Elsner, M, Rohde, H
Format:	Online Article Text
Language:	English
Published:	SAGE Publications 2013
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5393640/ http://dx.doi.org/10.1068/ig6

_version_	1783229590326476800
author	Clarke, A D F Elsner, M Rohde, H
author_facet	Clarke, A D F Elsner, M Rohde, H
author_sort	Clarke, A D F
collection	PubMed
description	Referring expression generation can be thought of as the converse problem to visual search: given a scene and a target, the participant's task is to generate a description which would allow somebody else to quickly and accurately locate the target. While this problem has been studied in psycholinguistics and natural language processing, we believe that vision science also has a role to play. In particular, previous work on this problem is based on simple scenes consisting of a small number of objects and treats vision almost as a pre-process that extracts feature categories for each object in the scene. However, it is unlikely these models will scale: we know from the visual search literature that some descriptions are better than others at enabling listeners to search efficiently within complex stimuli. We hypothesize speakers will be sensitive to visual features allowing them to compose such ‘good’ descriptions. In the present study, we investigate how visual properties (salience, clutter, area and distance) influence REG using images from the “Where's Wally?” books [Handford 1987], which are an order of magnitude more complex than the stimuli traditionally used in REG experiments. We find that referring expressions for large salient targets are shorter than those for smaller and less salient targets, and that targets within highly cluttered scenes are described using more words. The choice of spatial relations also appears to be influenced by visual properties as participants show a preference for referencing large, salient landmarks that are in close proximity to the target.
format	Online Article Text
id	pubmed-5393640
institution	National Center for Biotechnology Information
language	English
publishDate	2013
publisher	SAGE Publications
record_format	MEDLINE/PubMed
spelling	pubmed-5393640 2017-04-24 Can You Tell Me where Wally Is? Clarke, A D F Elsner, M Rohde, H Iperception Article SAGE Publications 2013-10-01 2013-10 /pmc/articles/PMC5393640/ http://dx.doi.org/10.1068/ig6 Text en © 2013 SAGE Publications Ltd.
Manuscript content on this site is licensed under Creative Commons Licenses http://creativecommons.org/licenses/by/3.0/ This article is distributed under the terms of the Creative Commons Attribution 3.0 License (http://www.creativecommons.org/licenses/by/3.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (http://www.uk.sagepub.com/aboutus/openaccess.htm).
spellingShingle	Article Clarke, A D F Elsner, M Rohde, H Can You Tell Me where Wally Is?
title	Can You Tell Me where Wally Is?
title_full	Can You Tell Me where Wally Is?
title_fullStr	Can You Tell Me where Wally Is?
title_full_unstemmed	Can You Tell Me where Wally Is?
title_short	Can You Tell Me where Wally Is?
title_sort	can you tell me where wally is?
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5393640/ http://dx.doi.org/10.1068/ig6
work_keys_str_mv	AT clarkeadf canyoutellmewherewallyis AT elsnerm canyoutellmewherewallyis AT rohdeh canyoutellmewherewallyis
