
Viewpoint planning with transition management for active object recognition

Active object recognition (AOR) provides a paradigm in which an agent captures additional evidence by purposefully changing its viewpoint to improve the quality of recognition. One of the central problems in AOR is viewpoint planning (VP), which refers to developing a policy that determines the agent's next viewpoints. A current research trend is to solve the VP problem with reinforcement learning, i.e., to use the viewpoint transitions explored by the agent to train the VP policy. However, most work discards the explored transitions once they have been used for training, which may lead to inefficient use of them. To address this challenge, we present a novel reinforcement-learning-based VP method with transition management that can reuse the explored viewpoint transitions. Specifically, a learning framework for the VP policy is first established via the deterministic policy gradient theory, which provides the opportunity to reuse explored transitions. We then design a viewpoint transition management scheme that stores the explored transitions and decides which of them are used for policy learning. Finally, within this framework, we develop an algorithm based on twin delayed deep deterministic policy gradient (TD3) and the designed scheme to train the VP policy. Experiments on the public and challenging GERMS dataset demonstrate the effectiveness of our method in comparison with several competing approaches.
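The abstract describes a TD3-style training loop in which explored viewpoint transitions are stored and selectively replayed for policy learning. As a rough illustration of that transition-management idea only, below is a minimal Python sketch assuming a standard (state, action, reward, next_state, done) transition format; the class name ViewpointTransitionBuffer, the sample_for_update method, and the recency-biased selection rule are illustrative assumptions, not the authors' actual scheme.

import random
from collections import deque

class ViewpointTransitionBuffer:
    """Stores explored viewpoint transitions and decides which ones are
    replayed for policy learning (illustrative sketch, not the paper's scheme)."""

    def __init__(self, capacity=100_000):
        # Oldest transitions are evicted automatically once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, done):
        # Every transition explored by the agent is kept rather than discarded.
        self.buffer.append((state, action, reward, next_state, done))

    def sample_for_update(self, batch_size=64, recent_fraction=0.5):
        # Mix recently explored transitions with uniformly sampled older ones,
        # so new viewpoints are reused without forgetting earlier experience.
        # (A placeholder selection rule; the paper's criterion may differ.)
        pool = list(self.buffer)
        n_recent = min(int(batch_size * recent_fraction), len(pool))
        recent = pool[-n_recent:] if n_recent else []
        older = random.sample(pool, min(batch_size - n_recent, len(pool)))
        return recent + older

# Hypothetical usage inside a TD3-style training loop:
#   buffer = ViewpointTransitionBuffer()
#   buffer.store(s, a, r, s_next, done)   # after each explored viewpoint
#   batch = buffer.sample_for_update()    # transitions selected for learning
#   ...feed batch to the twin critics and the delayed actor update...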


Bibliographic Details
Main Authors: Sun, Haibo, Zhu, Feng, Li, Yangyang, Zhao, Pengfei, Kong, Yanzi, Wang, Jianyu, Wan, Yingcai, Fu, Shuangfei
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Neuroscience
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9998679/
https://www.ncbi.nlm.nih.gov/pubmed/36910268
http://dx.doi.org/10.3389/fnbot.2023.1093132
collection PubMed
id pubmed-9998679
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
journal Front Neurorobot
published 2023-02-24
rights Copyright © 2023 Sun, Zhu, Li, Zhao, Kong, Wang, Wan and Fu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY), https://creativecommons.org/licenses/by/4.0/. The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
topic Neuroscience