
Object recognition combining vision and touch

This paper explores ways of combining vision and touch for the purpose of object recognition. In particular, it focuses on scenarios in which there are few tactile training samples (as these are usually costly to obtain) and in which vision is artificially impaired. Whilst machine vision is a widely studied...


Bibliographic Details
Main authors:	Corradi, Tadeo, Hall, Peter, Iravani, Pejman
Format:	Online Article Text
Language:	English
Published:	Springer Berlin Heidelberg 2017
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395591/ https://www.ncbi.nlm.nih.gov/pubmed/28480157 http://dx.doi.org/10.1186/s40638-017-0058-2

_version_	1783229894432391168
author	Corradi, Tadeo Hall, Peter Iravani, Pejman
author_sort	Corradi, Tadeo
collection	PubMed
description	This paper explores ways of combining vision and touch for the purpose of object recognition. In particular, it focuses on scenarios in which there are few tactile training samples (as these are usually costly to obtain) and in which vision is artificially impaired. Whilst machine vision is a widely studied field, and machine touch has received some attention recently, the fusion of both modalities remains a relatively unexplored area. It has been suggested that, in the human brain, there exist shared multi-sensorial representations of objects. This provides robustness when one or more senses are absent or unreliable. Modern robotic systems can benefit from multi-sensorial input, in particular in contexts where one or more of the sensors perform poorly. In this paper, a recently proposed tactile recognition model was extended by integrating a simple vision system in three different ways: vector concatenation (vision feature vector and tactile feature vector), object label posterior averaging and object label posterior product. A comparison is drawn in terms of overall accuracy of recognition and in terms of how quickly (number of training samples) learning occurs. The conclusions reached are: (1) the most accurate system is “posterior product”, (2) multi-modal recognition has higher accuracy than either modality alone if all visual and tactile training data are pooled together, and (3) in the case of visual impairment, multi-modal recognition “learns faster”, i.e. requires fewer training samples to achieve the same accuracy as either modality alone.
format	Online Article Text
id	pubmed-5395591
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Springer Berlin Heidelberg
record_format	MEDLINE/PubMed
spelling	pubmed-5395591 2017-05-04 Object recognition combining vision and touch Corradi, Tadeo Hall, Peter Iravani, Pejman Robotics Biomim Research
Springer Berlin Heidelberg 2017-04-18 2017 /pmc/articles/PMC5395591/ /pubmed/28480157 http://dx.doi.org/10.1186/s40638-017-0058-2 Text en © The Author(s) 2017. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle	Research Corradi, Tadeo Hall, Peter Iravani, Pejman Object recognition combining vision and touch
title	Object recognition combining vision and touch
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5395591/ https://www.ncbi.nlm.nih.gov/pubmed/28480157 http://dx.doi.org/10.1186/s40638-017-0058-2
work_keys_str_mv	AT corraditadeo objectrecognitioncombiningvisionandtouch AT hallpeter objectrecognitioncombiningvisionandtouch AT iravanipejman objectrecognitioncombiningvisionandtouch
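
The three fusion strategies named in the description (vector concatenation, object label posterior averaging and object label posterior product) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not specify the classifier or the features used, so the function names, the label set and the posterior values below are all hypothetical and chosen only to show how the three combinations differ.

```python
# Illustrative sketch only: names, labels and numbers are assumptions,
# not taken from the paper described in this record.

def normalize(scores):
    """Scale non-negative scores so that they sum to 1."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return [s / total for s in scores]

def concatenate_features(vision_features, tactile_features):
    """Vector concatenation: join both feature vectors into one,
    to be passed to a single classifier (not shown here)."""
    return list(vision_features) + list(tactile_features)

def posterior_average(p_vision, p_touch):
    """Posterior averaging: mean of the two per-label posteriors."""
    return normalize([(v + t) / 2 for v, t in zip(p_vision, p_touch)])

def posterior_product(p_vision, p_touch):
    """Posterior product: element-wise product, renormalized."""
    return normalize([v * t for v, t in zip(p_vision, p_touch)])

def predict(posterior, labels):
    """Return the label with the highest posterior."""
    best = max(range(len(posterior)), key=posterior.__getitem__)
    return labels[best]

# Hypothetical per-label posteriors from two separate classifiers.
labels = ["mug", "ball", "box"]
p_vision = [0.9, 0.1, 0.0]    # vision strongly favours "mug"
p_touch = [0.05, 0.5, 0.45]   # touch finds "mug" very unlikely

fused_features = concatenate_features([0.2, 0.7], [0.1, 0.4, 0.9])
label_average = predict(posterior_average(p_vision, p_touch), labels)
label_product = predict(posterior_product(p_vision, p_touch), labels)
print(len(fused_features), label_average, label_product)
```

With these made-up numbers the two late-fusion rules disagree: averaging follows the confident vision estimate, while the product penalizes any label that either modality considers very unlikely. The sketch makes no claim about which behaviour is preferable in general; the record reports only that "posterior product" was the most accurate system in the authors' experiments.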
