
Proposal-Based Visual Tracking Using Spatial Cascaded Transformed Region Proposal Network

Region proposal network (RPN) based trackers employ a classification and regression block to generate proposals; the proposal with the highest similarity score is taken as the ground-truth candidate for the next frame. However, RPN-based trackers cannot make the...


Bibliographic Details
Main Authors: Zhang, Ximing; Luo, Shujuan; Fan, Xuewu
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506765/
https://www.ncbi.nlm.nih.gov/pubmed/32858907
http://dx.doi.org/10.3390/s20174810
_version_ 1783585089164148736
author Zhang, Ximing
Luo, Shujuan
Fan, Xuewu
author_facet Zhang, Ximing
Luo, Shujuan
Fan, Xuewu
author_sort Zhang, Ximing
collection PubMed
description Region proposal network (RPN) based trackers employ a classification and regression block to generate proposals; the proposal with the highest similarity score is taken as the ground-truth candidate for the next frame. However, RPN-based trackers cannot make the best of the features from different convolutional layers, and the original loss function cannot alleviate the data imbalance issue of the training procedure. We propose the Spatial Cascaded Transformed RPN, which combines the RPN with an STN (spatial transformer network) to obtain high-quality proposals and simultaneously improve robustness. The STN passes spatially transformed features through the different stages, which extends the spatial representation capability of such networks in complex scenarios such as scale variation and affine transformation. We overcome the limitation of the original loss through an easy-sample penalization loss (shrinkage loss) used in place of the smooth L1 function. Moreover, we perform multi-cue proposal re-ranking to ensure the accuracy of the proposed tracker. We extensively demonstrate the effectiveness of the proposed method through ablation studies on tracking datasets, which include OTB-2015 (Object Tracking Benchmark 2015), VOT-2018 (Visual Object Tracking 2018), LaSOT (Large Scale Single Object Tracking), TrackingNet (A Large-Scale Dataset and Benchmark for Object Tracking in the Wild) and UAV123 (UAV Tracking Dataset).
format Online
Article
Text
id pubmed-7506765
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7506765 2020-09-26 Proposal-Based Visual Tracking Using Spatial Cascaded Transformed Region Proposal Network Zhang, Ximing Luo, Shujuan Fan, Xuewu Sensors (Basel) Article Region proposal network (RPN) based trackers employ a classification and regression block to generate proposals; the proposal with the highest similarity score is taken as the ground-truth candidate for the next frame. However, RPN-based trackers cannot make the best of the features from different convolutional layers, and the original loss function cannot alleviate the data imbalance issue of the training procedure. We propose the Spatial Cascaded Transformed RPN, which combines the RPN with an STN (spatial transformer network) to obtain high-quality proposals and simultaneously improve robustness. The STN passes spatially transformed features through the different stages, which extends the spatial representation capability of such networks in complex scenarios such as scale variation and affine transformation. We overcome the limitation of the original loss through an easy-sample penalization loss (shrinkage loss) used in place of the smooth L1 function. Moreover, we perform multi-cue proposal re-ranking to ensure the accuracy of the proposed tracker. We extensively demonstrate the effectiveness of the proposed method through ablation studies on tracking datasets, which include OTB-2015 (Object Tracking Benchmark 2015), VOT-2018 (Visual Object Tracking 2018), LaSOT (Large Scale Single Object Tracking), TrackingNet (A Large-Scale Dataset and Benchmark for Object Tracking in the Wild) and UAV123 (UAV Tracking Dataset). MDPI 2020-08-26 /pmc/articles/PMC7506765/ /pubmed/32858907 http://dx.doi.org/10.3390/s20174810 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Zhang, Ximing
Luo, Shujuan
Fan, Xuewu
Proposal-Based Visual Tracking Using Spatial Cascaded Transformed Region Proposal Network
title Proposal-Based Visual Tracking Using Spatial Cascaded Transformed Region Proposal Network
title_full Proposal-Based Visual Tracking Using Spatial Cascaded Transformed Region Proposal Network
title_fullStr Proposal-Based Visual Tracking Using Spatial Cascaded Transformed Region Proposal Network
title_full_unstemmed Proposal-Based Visual Tracking Using Spatial Cascaded Transformed Region Proposal Network
title_short Proposal-Based Visual Tracking Using Spatial Cascaded Transformed Region Proposal Network
title_sort proposal-based visual tracking using spatial cascaded transformed region proposal network
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7506765/
https://www.ncbi.nlm.nih.gov/pubmed/32858907
http://dx.doi.org/10.3390/s20174810
work_keys_str_mv AT zhangximing proposalbasedvisualtrackingusingspatialcascadedtransformedregionproposalnetwork
AT luoshujuan proposalbasedvisualtrackingusingspatialcascadedtransformedregionproposalnetwork
AT fanxuewu proposalbasedvisualtrackingusingspatialcascadedtransformedregionproposalnetwork