Knowledge- and ambiguity-aware robot learning from corrective and evaluative feedback
In order to deploy robots that could be adapted by non-expert users, interactive imitation learning (IIL) methods must be flexible regarding the interaction preferences of the teacher and avoid assumptions of perfect teachers (oracles), while considering they make mistakes influenced by diverse huma...
Main authors: | Celemin, Carlos; Kober, Jens |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer London 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10338625/ https://www.ncbi.nlm.nih.gov/pubmed/37455835 http://dx.doi.org/10.1007/s00521-022-08118-z |
_version_ | 1785071667827965952 |
---|---|
author | Celemin, Carlos Kober, Jens |
author_facet | Celemin, Carlos Kober, Jens |
author_sort | Celemin, Carlos |
collection | PubMed |
description | In order to deploy robots that could be adapted by non-expert users, interactive imitation learning (IIL) methods must be flexible regarding the interaction preferences of the teacher and avoid assumptions of perfect teachers (oracles), while considering they make mistakes influenced by diverse human factors. In this work, we propose an IIL method that improves the human–robot interaction for non-expert and imperfect teachers in two directions. First, uncertainty estimation is included to endow the agents with a lack of knowledge awareness (epistemic uncertainty) and demonstration ambiguity awareness (aleatoric uncertainty), such that the robot can request human input when it is deemed more necessary. Second, the proposed method enables the teachers to train with the flexibility of using corrective demonstrations, evaluative reinforcements, and implicit positive feedback. The experimental results show an improvement in learning convergence with respect to other learning methods when the agent learns from highly ambiguous teachers. Additionally, in a user study, it was found that the components of the proposed method improve the teaching experience and the data efficiency of the learning process. |
format | Online Article Text |
id | pubmed-10338625 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer London |
record_format | MEDLINE/PubMed |
spelling | pubmed-10338625 2023-07-14 Knowledge- and ambiguity-aware robot learning from corrective and evaluative feedback Celemin, Carlos Kober, Jens Neural Comput Appl S.I.: Human-aligned Reinforcement Learning for Autonomous Agents and Robots In order to deploy robots that could be adapted by non-expert users, interactive imitation learning (IIL) methods must be flexible regarding the interaction preferences of the teacher and avoid assumptions of perfect teachers (oracles), while considering they make mistakes influenced by diverse human factors. In this work, we propose an IIL method that improves the human–robot interaction for non-expert and imperfect teachers in two directions. First, uncertainty estimation is included to endow the agents with a lack of knowledge awareness (epistemic uncertainty) and demonstration ambiguity awareness (aleatoric uncertainty), such that the robot can request human input when it is deemed more necessary. Second, the proposed method enables the teachers to train with the flexibility of using corrective demonstrations, evaluative reinforcements, and implicit positive feedback. The experimental results show an improvement in learning convergence with respect to other learning methods when the agent learns from highly ambiguous teachers. Additionally, in a user study, it was found that the components of the proposed method improve the teaching experience and the data efficiency of the learning process. Springer London 2023-01-16 2023 /pmc/articles/PMC10338625/ /pubmed/37455835 http://dx.doi.org/10.1007/s00521-022-08118-z Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | S.I.: Human-aligned Reinforcement Learning for Autonomous Agents and Robots Celemin, Carlos Kober, Jens Knowledge- and ambiguity-aware robot learning from corrective and evaluative feedback |
title | Knowledge- and ambiguity-aware robot learning from corrective and evaluative feedback |
title_full | Knowledge- and ambiguity-aware robot learning from corrective and evaluative feedback |
title_fullStr | Knowledge- and ambiguity-aware robot learning from corrective and evaluative feedback |
title_full_unstemmed | Knowledge- and ambiguity-aware robot learning from corrective and evaluative feedback |
title_short | Knowledge- and ambiguity-aware robot learning from corrective and evaluative feedback |
title_sort | knowledge- and ambiguity-aware robot learning from corrective and evaluative feedback |
topic | S.I.: Human-aligned Reinforcement Learning for Autonomous Agents and Robots |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10338625/ https://www.ncbi.nlm.nih.gov/pubmed/37455835 http://dx.doi.org/10.1007/s00521-022-08118-z |
work_keys_str_mv | AT celemincarlos knowledgeandambiguityawarerobotlearningfromcorrectiveandevaluativefeedback AT koberjens knowledgeandambiguityawarerobotlearningfromcorrectiveandevaluativefeedback |