Viewpoint planning with transition management for active object recognition
Active object recognition (AOR) is a paradigm in which an agent captures additional evidence by purposefully changing its viewpoint to improve recognition quality. One of the central problems in AOR is viewpoint planning (VP), which refers to developing a policy to determine the n...
Main authors: | Sun, Haibo; Zhu, Feng; Li, Yangyang; Zhao, Pengfei; Kong, Yanzi; Wang, Jianyu; Wan, Yingcai; Fu, Shuangfei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Neuroscience |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9998679/ https://www.ncbi.nlm.nih.gov/pubmed/36910268 http://dx.doi.org/10.3389/fnbot.2023.1093132 |
_version_ | 1784903517825138688 |
---|---|
author | Sun, Haibo Zhu, Feng Li, Yangyang Zhao, Pengfei Kong, Yanzi Wang, Jianyu Wan, Yingcai Fu, Shuangfei |
author_facet | Sun, Haibo Zhu, Feng Li, Yangyang Zhao, Pengfei Kong, Yanzi Wang, Jianyu Wan, Yingcai Fu, Shuangfei |
author_sort | Sun, Haibo |
collection | PubMed |
description | Active object recognition (AOR) is a paradigm in which an agent captures additional evidence by purposefully changing its viewpoint to improve recognition quality. One of the central problems in AOR is viewpoint planning (VP), which refers to developing a policy that determines the agent's next viewpoints. A research trend is to solve the VP problem with reinforcement learning, that is, to use the viewpoint transitions explored by the agent to train the VP policy. However, most studies discard transitions once they have been used for training, which may lead to inefficient use of the explored transitions. To address this problem, we present a novel VP method with transition management based on reinforcement learning, which can reuse the explored viewpoint transitions. Specifically, a learning framework for the VP policy is first established via deterministic policy gradient theory, which makes it possible to reuse the explored transitions. Then, we design a viewpoint transition management scheme that stores the explored transitions and decides which transitions are used for policy learning. Finally, within this framework, we develop an algorithm based on twin delayed deep deterministic policy gradient and the designed scheme to train the VP policy. Experiments on the public and challenging GERMS dataset show the effectiveness of our method in comparison with several competing approaches. |
format | Online Article Text |
id | pubmed-9998679 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-9998679 2023-03-11 Viewpoint planning with transition management for active object recognition Sun, Haibo Zhu, Feng Li, Yangyang Zhao, Pengfei Kong, Yanzi Wang, Jianyu Wan, Yingcai Fu, Shuangfei Front Neurorobot Neuroscience Active object recognition (AOR) is a paradigm in which an agent captures additional evidence by purposefully changing its viewpoint to improve recognition quality. One of the central problems in AOR is viewpoint planning (VP), which refers to developing a policy that determines the agent's next viewpoints. A research trend is to solve the VP problem with reinforcement learning, that is, to use the viewpoint transitions explored by the agent to train the VP policy. However, most studies discard transitions once they have been used for training, which may lead to inefficient use of the explored transitions. To address this problem, we present a novel VP method with transition management based on reinforcement learning, which can reuse the explored viewpoint transitions. Specifically, a learning framework for the VP policy is first established via deterministic policy gradient theory, which makes it possible to reuse the explored transitions. Then, we design a viewpoint transition management scheme that stores the explored transitions and decides which transitions are used for policy learning. Finally, within this framework, we develop an algorithm based on twin delayed deep deterministic policy gradient and the designed scheme to train the VP policy. Experiments on the public and challenging GERMS dataset show the effectiveness of our method in comparison with several competing approaches. Frontiers Media S.A. 2023-02-24 /pmc/articles/PMC9998679/ /pubmed/36910268 http://dx.doi.org/10.3389/fnbot.2023.1093132 Text en Copyright © 2023 Sun, Zhu, Li, Zhao, Kong, Wang, Wan and Fu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience Sun, Haibo Zhu, Feng Li, Yangyang Zhao, Pengfei Kong, Yanzi Wang, Jianyu Wan, Yingcai Fu, Shuangfei Viewpoint planning with transition management for active object recognition |
title | Viewpoint planning with transition management for active object recognition |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9998679/ https://www.ncbi.nlm.nih.gov/pubmed/36910268 http://dx.doi.org/10.3389/fnbot.2023.1093132 |
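
The description above mentions a scheme that stores explored viewpoint transitions and decides which ones are used for policy learning. The record does not give the details of that scheme, so the following is only a minimal sketch of the general idea, assuming a fixed-capacity store with oldest-first eviction and uniform random selection. The names `Transition` and `TransitionStore`, the field layout, and the selection rule are all hypothetical and are not taken from the article.

```python
import random
from collections import deque
from typing import Deque, List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One explored viewpoint transition (field names are illustrative)."""
    state: Sequence[float]
    action: float
    reward: float
    next_state: Sequence[float]
    done: bool


class TransitionStore:
    """Fixed-capacity store; the oldest transitions are evicted first."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self._buffer: Deque[Transition] = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def __len__(self) -> int:
        return len(self._buffer)

    def add(self, transition: Transition) -> None:
        # deque(maxlen=...) silently drops the oldest item when full.
        self._buffer.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        # Assumed selection rule: uniform sampling without replacement.
        # The article's scheme may choose transitions differently.
        k = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), k)


# Store five transitions in a store of capacity three, then draw a batch.
store = TransitionStore(capacity=3)
for step in range(5):
    store.add(Transition([float(step)], 0.1 * step, 1.0, [float(step + 1)], step == 4))
batch = store.sample(2)
```

An off-policy learner, such as one based on twin delayed deep deterministic policy gradient, could draw batches like `batch` repeatedly, which is what allows stored transitions to be reused instead of discarded after a single update.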