
Speech Recognition for the iCub Platform

This paper describes open source software (available at https://github.com/robotology/natural-speech) to build automatic speech recognition (ASR) systems and run them within the YARP platform. The toolkit is designed (i) to allow non-ASR experts to easily create their own ASR system and run it on iCub and (ii) to build deep learning-based models specifically addressing the main challenges an ASR system faces in the context of verbal human–iCub interactions. The toolkit mostly consists of Python, C++ code and shell scripts integrated in YARP. As additional contribution, a second codebase (written in Matlab) is provided for more expert ASR users who want to experiment with bio-inspired and developmental learning-inspired ASR systems. Specifically, we provide code for two distinct kinds of speech recognition: “articulatory” and “unsupervised” speech recognition. The first is largely inspired by influential neurobiological theories of speech perception which assume speech perception to be mediated by brain motor cortex activities. Our articulatory systems have been shown to outperform strong deep learning-based baselines. The second type of recognition systems, the “unsupervised” systems, do not use any supervised information (contrary to most ASR systems, including our articulatory systems). To some extent, they mimic an infant who has to discover the basic speech units of a language by herself. In addition, we provide resources consisting of pre-trained deep learning models for ASR, and a 2.5-h speech dataset of spoken commands, the VoCub dataset, which can be used to adapt an ASR system to the typical acoustic environments in which iCub operates.
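The "unsupervised" systems mentioned in the abstract discover the basic speech units of a language without transcriptions. As a purely didactic sketch (not the method from the paper), clustering unlabeled feature vectors with k-means conveys the core idea: recurring acoustic patterns can be grouped into unit-like categories using no supervised information at all. All names and the synthetic data below are illustrative.

```python
# Toy illustration of unsupervised unit discovery: group unlabeled
# acoustic-like feature vectors into recurring "units" with k-means.
# Didactic sketch only; the paper's systems are far more sophisticated.
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Cluster 2-D points into k groups; return (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(range(k), key=lambda c: math.dist(p, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for i, p in enumerate(points) if labels[i] == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return centroids, labels


# Synthetic "features": two well-separated groups standing in for
# two recurring speech units in unlabeled audio.
rng = random.Random(1)
unit_a = [(rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)) for _ in range(30)]
unit_b = [(rng.gauss(3.0, 0.1), rng.gauss(3.0, 0.1)) for _ in range(30)]
centroids, labels = kmeans(unit_a + unit_b, k=2, seed=0)
```

With no labels provided, the algorithm recovers two categories that align with the two generating groups; real unsupervised ASR faces the much harder problem of doing this on variable-length, speaker-dependent acoustic sequences.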


Bibliographic Details
Main Authors: Higy, Bertrand; Mereta, Alessio; Metta, Giorgio; Badino, Leonardo
Format: Online Article Text
Language: English
Published: Frontiers Media S.A., 2018
Subjects: Robotics and AI
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7805979/
https://www.ncbi.nlm.nih.gov/pubmed/33500897
http://dx.doi.org/10.3389/frobt.2018.00010
Journal: Front Robot AI, Robotics and AI section. Published online: 2018-02-12.
Copyright © 2018 Higy, Mereta, Metta and Badino. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY; http://creativecommons.org/licenses/by/4.0/). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.