Interactive Natural Language Grounding via Referring Expression Comprehension and Scene Graph Parsing
Natural language provides an intuitive and effective interaction interface between human beings and robots. Currently, multiple approaches are presented to address natural language visual grounding for human-robot interaction. However, most of the existing approaches handle the ambiguity of natural...
Main authors: | Mi, Jinpeng; Lyu, Jianzhi; Tang, Song; Li, Qingdu; Zhang, Jianwei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Neuroscience |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7331387/ https://www.ncbi.nlm.nih.gov/pubmed/32670046 http://dx.doi.org/10.3389/fnbot.2020.00043 |
_version_ | 1783553318047449088 |
---|---|
author | Mi, Jinpeng Lyu, Jianzhi Tang, Song Li, Qingdu Zhang, Jianwei |
author_facet | Mi, Jinpeng Lyu, Jianzhi Tang, Song Li, Qingdu Zhang, Jianwei |
author_sort | Mi, Jinpeng |
collection | PubMed |
description | Natural language provides an intuitive and effective interaction interface between human beings and robots. Currently, multiple approaches are presented to address natural language visual grounding for human-robot interaction. However, most of the existing approaches handle the ambiguity of natural language queries and achieve target objects grounding via dialogue systems, which make the interactions cumbersome and time-consuming. In contrast, we address interactive natural language grounding without auxiliary information. Specifically, we first propose a referring expression comprehension network to ground natural referring expressions. The referring expression comprehension network excavates the visual semantics via a visual semantic-aware network, and exploits the rich linguistic contexts in expressions by a language attention network. Furthermore, we combine the referring expression comprehension network with scene graph parsing to achieve unrestricted and complicated natural language grounding. Finally, we validate the performance of the referring expression comprehension network on three public datasets, and we also evaluate the effectiveness of the interactive natural language grounding architecture by conducting extensive natural language query groundings in different household scenarios. |
format | Online Article Text |
id | pubmed-7331387 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7331387 2020-07-14 Interactive Natural Language Grounding via Referring Expression Comprehension and Scene Graph Parsing Mi, Jinpeng Lyu, Jianzhi Tang, Song Li, Qingdu Zhang, Jianwei Front Neurorobot Neuroscience Natural language provides an intuitive and effective interaction interface between human beings and robots. Currently, multiple approaches are presented to address natural language visual grounding for human-robot interaction. However, most of the existing approaches handle the ambiguity of natural language queries and achieve target objects grounding via dialogue systems, which make the interactions cumbersome and time-consuming. In contrast, we address interactive natural language grounding without auxiliary information. Specifically, we first propose a referring expression comprehension network to ground natural referring expressions. The referring expression comprehension network excavates the visual semantics via a visual semantic-aware network, and exploits the rich linguistic contexts in expressions by a language attention network. Furthermore, we combine the referring expression comprehension network with scene graph parsing to achieve unrestricted and complicated natural language grounding. Finally, we validate the performance of the referring expression comprehension network on three public datasets, and we also evaluate the effectiveness of the interactive natural language grounding architecture by conducting extensive natural language query groundings in different household scenarios. Frontiers Media S.A. 2020-06-25 /pmc/articles/PMC7331387/ /pubmed/32670046 http://dx.doi.org/10.3389/fnbot.2020.00043 Text en Copyright © 2020 Mi, Lyu, Tang, Li and Zhang. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience Mi, Jinpeng Lyu, Jianzhi Tang, Song Li, Qingdu Zhang, Jianwei Interactive Natural Language Grounding via Referring Expression Comprehension and Scene Graph Parsing |
title | Interactive Natural Language Grounding via Referring Expression Comprehension and Scene Graph Parsing |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7331387/ https://www.ncbi.nlm.nih.gov/pubmed/32670046 http://dx.doi.org/10.3389/fnbot.2020.00043 |