The Functional Correspondence Problem
The ability to find correspondences in visual data is the essence of most computer vision tasks. But what are the right correspondences? The task of visual correspondence is well defined for two different images of the same object instance. In the case of two images of objects belonging to the same category, vi...
Main authors: | Lai, Zihang; Purushwalkam, Senthil; Gupta, Abhinav |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Cornell University, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8475963/ https://www.ncbi.nlm.nih.gov/pubmed/34580645 |
_version_ | 1784575501548912640 |
---|---|
author | Lai, Zihang Purushwalkam, Senthil Gupta, Abhinav |
author_facet | Lai, Zihang Purushwalkam, Senthil Gupta, Abhinav |
author_sort | Lai, Zihang |
collection | PubMed |
description | The ability to find correspondences in visual data is the essence of most computer vision tasks. But what are the right correspondences? The task of visual correspondence is well defined for two different images of the same object instance. In the case of two images of objects belonging to the same category, visual correspondence is reasonably well defined in most cases. But what about correspondence between two objects of completely different categories, e.g., a shoe and a bottle? Does any correspondence exist? Inspired by humans’ ability to (a) generalize beyond semantic categories and (b) infer functional affordances, we introduce the problem of functional correspondences in this paper. Given images of two objects, we ask a simple question: what is the set of correspondences between these two images for a given task? For example, what are the correspondences between a bottle and a shoe for the task of pounding or the task of pouring? We introduce a new dataset, FunKPoint, that has ground-truth correspondences for 10 tasks and 20 object categories. We also introduce a modular task-driven representation for attacking this problem and demonstrate that our learned representation is effective for this task. Most importantly, because our supervision signal is not bound by semantics, we show that our learned representation can generalize better on few-shot classification problems. We hope this paper will inspire our community to think beyond semantics and focus more on cross-category generalization and learning representations for robotics tasks. |
format | Online Article Text |
id | pubmed-8475963 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Cornell University |
record_format | MEDLINE/PubMed |
spelling | pubmed-8475963 2021-09-28 The Functional Correspondence Problem Lai, Zihang Purushwalkam, Senthil Gupta, Abhinav ArXiv Article
Cornell University 2021-09-02 /pmc/articles/PMC8475963/ /pubmed/34580645 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. |
spellingShingle | Article Lai, Zihang Purushwalkam, Senthil Gupta, Abhinav The Functional Correspondence Problem |
title | The Functional Correspondence Problem |
title_full | The Functional Correspondence Problem |
title_fullStr | The Functional Correspondence Problem |
title_full_unstemmed | The Functional Correspondence Problem |
title_short | The Functional Correspondence Problem |
title_sort | functional correspondence problem |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8475963/ https://www.ncbi.nlm.nih.gov/pubmed/34580645 |
work_keys_str_mv | AT laizihang thefunctionalcorrespondenceproblem AT purushwalkamsenthil thefunctionalcorrespondenceproblem AT guptaabhinav thefunctionalcorrespondenceproblem AT laizihang functionalcorrespondenceproblem AT purushwalkamsenthil functionalcorrespondenceproblem AT guptaabhinav functionalcorrespondenceproblem |