Real-Time Visual Tracking with Variational Structure Attention Network

Bibliographic Details
Main Authors: Kim, Yeongbin, Shin, Joongchol, Park, Hasil, Paik, Joonki
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6891527/
https://www.ncbi.nlm.nih.gov/pubmed/31717609
http://dx.doi.org/10.3390/s19224904
_version_ 1783475834988789760
author Kim, Yeongbin
Shin, Joongchol
Park, Hasil
Paik, Joonki
author_facet Kim, Yeongbin
Shin, Joongchol
Park, Hasil
Paik, Joonki
author_sort Kim, Yeongbin
collection PubMed
description Online training frameworks based on discriminative correlation filters for visual tracking have recently shown significant improvements in both accuracy and speed. However, correlation filter-based discriminative approaches share a common problem: tracking performance degrades when the local structure of a target is distorted by the boundary effect. The shape distortion of the target is mainly caused by the circulant structure of Fourier-domain processing, which makes the correlation filter learn distorted training samples. In this paper, we present a structure-attention network that preserves the target structure against the distortion caused by the boundary effect. More specifically, we adopt a variational auto-encoder as the structure-attention network to generate diverse and representative target structures. We also propose two denoising criteria using a novel reconstruction loss for the variational auto-encoding framework to capture more robust structures even under the boundary condition. Through the proposed structure-attention framework, discriminative correlation filters can learn robust structure information of targets during online training, with enhanced discriminating performance and adaptability. Experimental results on major visual tracking benchmark datasets show that the proposed method produces better or comparable performance compared with state-of-the-art tracking methods, at a real-time processing speed of more than 80 frames per second.
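The abstract's mention of the "circulant structure" in Fourier-domain processing refers to the standard closed-form correlation-filter update used by DCF trackers (MOSSE/KCF style). The sketch below is not taken from the paper and does not reproduce its structure-attention variational auto-encoder; it is a minimal NumPy illustration, assuming a single training patch, of why the filter is learned with element-wise Fourier-domain operations and why the implicit circular shifts of the training patch cause the boundary effect the authors address. Function names such as learn_filter and track and the regularization parameter lam are illustrative, not from the paper.

import numpy as np

def gaussian_response(shape, sigma=2.0):
    """Desired correlation output: a Gaussian peak centered on the target."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2.0 * sigma ** 2))
    return np.fft.ifftshift(g)  # move the peak to (0, 0) to match FFT conventions

def learn_filter(patch, response, lam=1e-2):
    """Single-sample ridge-regression (MOSSE-style) filter in the Fourier domain.

    The element-wise division is valid only because circular correlation
    diagonalizes in the Fourier basis; that circulant structure implicitly
    wraps the training patch around its borders, which is the boundary
    effect the paper's structure-attention network is meant to counteract.
    """
    F = np.fft.fft2(patch)
    G = np.fft.fft2(response)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)  # conjugate filter in the Fourier domain

def track(H_conj, search_patch):
    """Correlate the learned filter with a new patch; the response peak gives the target shift."""
    resp = np.real(np.fft.ifft2(H_conj * np.fft.fft2(search_patch)))
    return np.unravel_index(np.argmax(resp), resp.shape)

# Usage sketch: a random patch stands in for real image features.
patch = np.random.rand(64, 64)
H_conj = learn_filter(patch, gaussian_response(patch.shape))
print(track(H_conj, np.roll(patch, (3, 5), axis=(0, 1))))  # expected shift of about (3, 5)

As a design note, practical DCF trackers also window the patch and average filters over frames; per the abstract, the paper's contribution is to supply the filter with robust structure information from a VAE trained with a denoising reconstruction loss rather than to change this closed-form update.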
format Online
Article
Text
id pubmed-6891527
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6891527 2019-12-18 Real-Time Visual Tracking with Variational Structure Attention Network Kim, Yeongbin Shin, Joongchol Park, Hasil Paik, Joonki Sensors (Basel) Article Online training frameworks based on discriminative correlation filters for visual tracking have recently shown significant improvements in both accuracy and speed. However, correlation filter-based discriminative approaches share a common problem: tracking performance degrades when the local structure of a target is distorted by the boundary effect. The shape distortion of the target is mainly caused by the circulant structure of Fourier-domain processing, which makes the correlation filter learn distorted training samples. In this paper, we present a structure-attention network that preserves the target structure against the distortion caused by the boundary effect. More specifically, we adopt a variational auto-encoder as the structure-attention network to generate diverse and representative target structures. We also propose two denoising criteria using a novel reconstruction loss for the variational auto-encoding framework to capture more robust structures even under the boundary condition. Through the proposed structure-attention framework, discriminative correlation filters can learn robust structure information of targets during online training, with enhanced discriminating performance and adaptability. Experimental results on major visual tracking benchmark datasets show that the proposed method produces better or comparable performance compared with state-of-the-art tracking methods, at a real-time processing speed of more than 80 frames per second. MDPI 2019-11-09 /pmc/articles/PMC6891527/ /pubmed/31717609 http://dx.doi.org/10.3390/s19224904 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Kim, Yeongbin
Shin, Joongchol
Park, Hasil
Paik, Joonki
Real-Time Visual Tracking with Variational Structure Attention Network
title Real-Time Visual Tracking with Variational Structure Attention Network
title_full Real-Time Visual Tracking with Variational Structure Attention Network
title_fullStr Real-Time Visual Tracking with Variational Structure Attention Network
title_full_unstemmed Real-Time Visual Tracking with Variational Structure Attention Network
title_short Real-Time Visual Tracking with Variational Structure Attention Network
title_sort real-time visual tracking with variational structure attention network
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6891527/
https://www.ncbi.nlm.nih.gov/pubmed/31717609
http://dx.doi.org/10.3390/s19224904
work_keys_str_mv AT kimyeongbin realtimevisualtrackingwithvariationalstructureattentionnetwork
AT shinjoongchol realtimevisualtrackingwithvariationalstructureattentionnetwork
AT parkhasil realtimevisualtrackingwithvariationalstructureattentionnetwork
AT paikjoonki realtimevisualtrackingwithvariationalstructureattentionnetwork