Intention-Related Natural Language Grounding via Object Affordance Detection and Intention Semantic Extraction

Like specific natural language instructions, intention-related natural language queries play an essential role in everyday communication. Inspired by the psychological concept of “affordance” and its applications in human-robot interaction, we propose an object affordance-based natural langu...

Bibliographic Details
Main authors:	Mi, Jinpeng; Liang, Hongzhuo; Katsakis, Nikolaos; Tang, Song; Li, Qingdu; Zhang, Changshui; Zhang, Jianwei
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2020
Subjects:	Neuroscience
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7238763/ https://www.ncbi.nlm.nih.gov/pubmed/32477091 http://dx.doi.org/10.3389/fnbot.2020.00026

_version_	1783536593802362880
author	Mi, Jinpeng; Liang, Hongzhuo; Katsakis, Nikolaos; Tang, Song; Li, Qingdu; Zhang, Changshui; Zhang, Jianwei
author_facet	Mi, Jinpeng; Liang, Hongzhuo; Katsakis, Nikolaos; Tang, Song; Li, Qingdu; Zhang, Changshui; Zhang, Jianwei
author_sort	Mi, Jinpeng
collection	PubMed
description	Like specific natural language instructions, intention-related natural language queries play an essential role in everyday communication. Inspired by the psychological concept of “affordance” and its applications in human-robot interaction, we propose an object affordance-based natural language visual grounding architecture to ground intention-related natural language queries. We first present an attention-based multi-visual feature fusion network that detects object affordances from RGB images. The network fuses deep visual features extracted from a pre-trained CNN model with deep texture features encoded by a deep texture encoding network. In doing so, it takes into account the interaction of the multi-visual features and preserves their complementary nature by integrating attention weights learned from sparse representations of those features. We train and validate the attention-based object affordance recognition network on a self-built dataset in which a large number of images originate from MSCOCO and ImageNet. Moreover, we introduce an intention semantic extraction module to extract intention semantics from intention-related natural language queries. Finally, we ground intention-related natural language queries by integrating the detected object affordances with the extracted intention semantics. We conduct extensive experiments to validate the performance of the object affordance detection network and of the grounding architecture for intention-related natural language queries.
format	Online Article Text
id	pubmed-7238763
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-7238763 2020-05-29 Intention-Related Natural Language Grounding via Object Affordance Detection and Intention Semantic Extraction Mi, Jinpeng; Liang, Hongzhuo; Katsakis, Nikolaos; Tang, Song; Li, Qingdu; Zhang, Changshui; Zhang, Jianwei Front Neurorobot Neuroscience Frontiers Media S.A. 2020-05-13 /pmc/articles/PMC7238763/ /pubmed/32477091 http://dx.doi.org/10.3389/fnbot.2020.00026 Text en Copyright © 2020 Mi, Liang, Katsakis, Tang, Li, Zhang and Zhang. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Intention-Related Natural Language Grounding via Object Affordance Detection and Intention Semantic Extraction
topic	Neuroscience
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7238763/ https://www.ncbi.nlm.nih.gov/pubmed/32477091 http://dx.doi.org/10.3389/fnbot.2020.00026