Can You Tell Me where Wally Is?
Referring expression generation can be thought of as the converse problem to visual search: given a scene and a target, the participant's task is to generate a description which would allow somebody else to quickly and accurately locate the target. While this problem has been studied in psychol...
Main authors: | Clarke, A D F, Elsner, M, Rohde, H |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | SAGE Publications 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5393640/ http://dx.doi.org/10.1068/ig6 |
_version_ | 1783229590326476800 |
---|---|
author | Clarke, A D F Elsner, M Rohde, H |
author_sort | Clarke, A D F |
collection | PubMed |
description | Referring expression generation can be thought of as the converse problem to visual search: given a scene and a target, the participant's task is to generate a description which would allow somebody else to quickly and accurately locate the target. While this problem has been studied in psycholinguistics and natural language processing, we believe that vision science also has a role to play. In particular, previous work on this problem is based on simple scenes consisting of a small number of objects and treats vision almost as a pre-process that extracts feature categories for each object in the scene. However, it is unlikely these models will scale: we know from the visual search literature that some descriptions are better than others at enabling listeners to search efficiently within complex stimuli. We hypothesize speakers will be sensitive to visual features allowing them to compose such ‘good’ descriptions. In the present study, we investigate how visual properties (salience, clutter, area and distance) influence REG using images from the “Where's Wally?” books [Handford 1987], which are an order of magnitude more complex than the stimuli traditionally used in REG experiments. We find that referring expressions for large salient targets are shorter than those for smaller and less salient targets, and that targets within highly cluttered scenes are described using more words. The choice of spatial relations also appears to be influenced by visual properties as participants show a preference for referencing large, salient landmarks that are in close proximity to the target. |
format | Online Article Text |
id | pubmed-5393640 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-5393640 2017-04-24 Can You Tell Me where Wally Is? Clarke, A D F Elsner, M Rohde, H Iperception Article SAGE Publications 2013-10-01 2013-10 /pmc/articles/PMC5393640/ http://dx.doi.org/10.1068/ig6 Text en © 2013 SAGE Publications Ltd. Manuscript content on this site is licensed under Creative Commons Licenses http://creativecommons.org/licenses/by/3.0/ This article is distributed under the terms of the Creative Commons Attribution 3.0 License (http://www.creativecommons.org/licenses/by/3.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (http://www.uk.sagepub.com/aboutus/openaccess.htm). |
spellingShingle | Article Clarke, A D F Elsner, M Rohde, H Can You Tell Me where Wally Is? |
title | Can You Tell Me where Wally Is? |
title_sort | can you tell me where wally is? |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5393640/ http://dx.doi.org/10.1068/ig6 |
work_keys_str_mv | AT clarkeadf canyoutellmewherewallyis AT elsnerm canyoutellmewherewallyis AT rohdeh canyoutellmewherewallyis |