Audio-Visual Perception System for a Humanoid Robotic Head

One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to the people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those in video-conference rooms, and may therefore run into difficulties when constrained to the sensors with which a robot can be equipped. Moreover, within the scope of interactive autonomous robots, the benefits of audio-visual attention mechanisms over audio-only or visual-only approaches have rarely been evaluated in real scenarios: most tests have been conducted in controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with Bayesian inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. The performance of this system is evaluated and compared with unimodal alternatives, taking their technical limitations into account. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interaction framework.
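The approach summarized above, fusing audio and visual cues through Bayesian inference to localize a speaker, can be illustrated with a minimal sketch. Everything below is an illustrative assumption (the 1-degree azimuth grid, Gaussian sensor models, noise values, and diffusion-based motion model), not the authors' implementation:

import numpy as np

ANGLES = np.linspace(-90.0, 90.0, 181)  # azimuth grid, 1-degree bins

def likelihood(measurement_deg, sigma_deg):
    # Gaussian likelihood of each grid angle given one azimuth cue.
    l = np.exp(-0.5 * ((ANGLES - measurement_deg) / sigma_deg) ** 2)
    return l / l.sum()

class SpeakerFilter:
    # Discrete recursive Bayes filter over the speaker's azimuth.
    def __init__(self):
        self.belief = np.full(ANGLES.size, 1.0 / ANGLES.size)  # uniform prior

    def predict(self, motion_sigma_deg=3.0):
        # Blur the belief with a Gaussian kernel to model speaker motion.
        kernel = likelihood(0.0, motion_sigma_deg)
        self.belief = np.convolve(self.belief, kernel, mode="same")
        self.belief /= self.belief.sum()

    def update(self, audio_deg=None, visual_deg=None):
        # Fuse whichever cues arrived this frame. The audio cue (e.g. a
        # sound-source direction estimate) is assumed noisier than a
        # visual face detection, hence the larger sigma.
        if audio_deg is not None:
            self.belief *= likelihood(audio_deg, sigma_deg=10.0)
        if visual_deg is not None:
            self.belief *= likelihood(visual_deg, sigma_deg=3.0)
        self.belief /= self.belief.sum()

    def estimate(self):
        return ANGLES[np.argmax(self.belief)]  # MAP azimuth, in degrees

# Example frame: audio puts the speaker near 22 deg, vision near 18 deg.
f = SpeakerFilter()
f.predict()
f.update(audio_deg=22.0, visual_deg=18.0)
print(f"estimated speaker azimuth: {f.estimate():.0f} deg")

Because both cues are expressed as likelihoods over the same azimuth grid, a frame in which one modality fails (a face out of view, or silence) simply leaves the belief to be carried by the other, which is the practical benefit of fusion over the unimodal alternatives discussed in the abstract.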


Bibliographic Details
Main Authors: Viciana-Abad, Raquel, Marfil, Rebeca, Perez-Lorenzo, Jose M., Bandera, Juan P., Romero-Garces, Adrian, Reche-Lopez, Pedro
Format: Online Article Text
Language: English
Published: MDPI 2014
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4118331/
https://www.ncbi.nlm.nih.gov/pubmed/24878593
http://dx.doi.org/10.3390/s140609522
_version_ 1782328825730826240
author Viciana-Abad, Raquel
Marfil, Rebeca
Perez-Lorenzo, Jose M.
Bandera, Juan P.
Romero-Garces, Adrian
Reche-Lopez, Pedro
author_facet Viciana-Abad, Raquel
Marfil, Rebeca
Perez-Lorenzo, Jose M.
Bandera, Juan P.
Romero-Garces, Adrian
Reche-Lopez, Pedro
author_sort Viciana-Abad, Raquel
collection PubMed
description One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus, they may incur difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, there is a lack in terms of evaluating the benefits of audio-visual attention mechanisms, compared to only audio or visual approaches, in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with a Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared via considering the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework.
format Online
Article
Text
id pubmed-4118331
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-41183312014-08-01 Audio-Visual Perception System for a Humanoid Robotic Head Viciana-Abad, Raquel Marfil, Rebeca Perez-Lorenzo, Jose M. Bandera, Juan P. Romero-Garces, Adrian Reche-Lopez, Pedro Sensors (Basel) Article One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus, they may incur difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, there is a lack in terms of evaluating the benefits of audio-visual attention mechanisms, compared to only audio or visual approaches, in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with a Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared via considering the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework. MDPI 2014-05-28 /pmc/articles/PMC4118331/ /pubmed/24878593 http://dx.doi.org/10.3390/s140609522 Text en © 2014 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
spellingShingle Article
Viciana-Abad, Raquel
Marfil, Rebeca
Perez-Lorenzo, Jose M.
Bandera, Juan P.
Romero-Garces, Adrian
Reche-Lopez, Pedro
Audio-Visual Perception System for a Humanoid Robotic Head
title Audio-Visual Perception System for a Humanoid Robotic Head
title_full Audio-Visual Perception System for a Humanoid Robotic Head
title_fullStr Audio-Visual Perception System for a Humanoid Robotic Head
title_full_unstemmed Audio-Visual Perception System for a Humanoid Robotic Head
title_short Audio-Visual Perception System for a Humanoid Robotic Head
title_sort audio-visual perception system for a humanoid robotic head
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4118331/
https://www.ncbi.nlm.nih.gov/pubmed/24878593
http://dx.doi.org/10.3390/s140609522
work_keys_str_mv AT vicianaabadraquel audiovisualperceptionsystemforahumanoidrobotichead
AT marfilrebeca audiovisualperceptionsystemforahumanoidrobotichead
AT perezlorenzojosem audiovisualperceptionsystemforahumanoidrobotichead
AT banderajuanp audiovisualperceptionsystemforahumanoidrobotichead
AT romerogarcesadrian audiovisualperceptionsystemforahumanoidrobotichead
AT rechelopezpedro audiovisualperceptionsystemforahumanoidrobotichead